Summary of database standardization

Tram Ho

Goal of standardization

Removing data redundancy Removing update anomaly Removing insertion anomaly Removing deletion anomaly

Data normalization forms

Standardization is the process of separating tables (decomposing) into smaller tables based on functional dependencies. Standard forms are instructions for designing tables in the database.

The purpose of standardization is to eliminate data redundancies and errors when manipulating redundancy and errors when manipulating data (Insert, Delete, Update). But standardization increases query time.

Normalized forms (Normal Form)

Standard 1 – 1NF (First Normal Form)

Definition: A table (relationship) is called in the normal form 1NF if and only if the entire range of values ​​of the columns present in the table (relations) only contain atomic values ​​(elements)

For example:

A table (relationship) is not yet at 1NF:

MASVHOTENADDRESSMAMONTENMONDIEM
A01Le Na12 Thai HaM01M02CSDLAnh89
A02Tran An56 Ma MayM01Database8
A03Ha Nam24 Wooden BridgeM01M02M03CSDLAnhToán 1689

Standard form 2 – 2NF

Definition A relation in the standard form 2NF if that relationship:

  • Is 1NF
  • Non-key attributes must be fully dependent on the primary key

For example

  • Example1: For relation R = (ABCD), the key is AB and the dependent set of F = {AB -> C, AB -> D} is the relation of standard 2NF.
  • Example 2: For relation R = (ABCD), key is AB and functional dependency set

F = {AB -> C, AB -> D, B -> DC} is a non-standard relationship of 2NF because of a functional dependency

B -> DC is a function dependency (dependent on incomplete functions) on the key. Then, we bring back the standard form 2NF as follows:

Comment

A relationship is in the form of 2NF if it satisfies one of the following conditions: The primary key contains only one attribute The table does not have non-key attributes All non-key attributes depend entirely on the set of primary key attributes

Standard type 3 – 3NF

Definition of a relationship in the standard 3NF form if that relationship:

  • Is 2NF
  • Non-key attributes must depend directly on the primary key

For example

  • Example1: For relation R = (ABCDGH, key is AB and dependent set of function F = {AB -> C, AB -> D, AB -> GH} is the relation reaching standard 3NF.
  • Example 2: For relation R = (ABCDGH), key is AB and functional dependency set

F = {AB -> C, AB -> D, AB -> GH, G -> DH} is a non-standard relationship of 3NF because there is a G-DH dependency that depends on the indirect function on the key. Then we bring the standard form 3NF as follows:

Build multi-lingual database
The Ministry of Finance will test AI and Big data for tax and customs management

BCNF Standard Form (Boyce Codd Normal Form)

Definition of a relationship in the BCNF standard form if that relationship:

  • Is 3NF
  • There is no key attribute that depends on the function key.

For example

  • Example 1: For relation R = (ABCDGH, key is AB and dependent set of function F = {AB -> C, AB -> D, AB -> GH} is a relationship with BCNF standard.
  • Example 2: For relation R = (ABCDGH), key is AB and functional dependency set

F = {AB -> C, AB -> D, AB -> GH, H -> B} is a BCNF unqualified relationship because there is a key B dependent attribute on the non-key attribute H. Then we give on BCNF standard form as follows:

Standardized method

The basic principle in standardizing databases is to eliminate data redundancy by disintegrating relationships but not losing information. Tutorial about database standardization

Tutorial 1 – Use flattening method to standardize 1NF

There is a very good tutorial on standardizing databases here and because this is a Vietnamese language information technology site, that tutorial also has a Vietnamese version for those who do not like English. Please discuss the standardization of databases in this topic.

Tutorial 2 – Use iterative grouping method to standardize 1NF

This is the fourth subject of Software Engineering term 7 course 49 of the DT12 Faculty of Electronic Engineering Department of Hanoi University of Technology. Different from tutorial 1 above using flattening method, I use iterative grouping method to bring about 1NF form first. For an iterative grouping method, see also tutorial 3.

Requirements: Database design for drug store management needs the following information:

DB (Invoice Code, Sale Date, Total Invoice, Customer Code, Customer Name, Customer Number, Employee Code, CMT Number of NV, NV Name, NV Address, Employee Number (Drug Code) , Drug name, Uses, Production date, SD term, Quantity, Amount)).

Function dependencies:

  • Invoice code, Customer code, Employee code, Medicine code, uniquely identify 1 bill, 1 customer, 1 NV, 1 drug.
  • An invoice may have many drugs but only one buyer and one seller.
  • CMT number of NV-> Name NV, Address NV

1NF – Remove repeating groups and remove calculated properties.

Need type 2 following calculation attributes: Total bill and money amount. Because this is redundant data.

  1. Invoice (Invoice code, Sale date, Customer code, Customer Name, Customer number, Employee code, CMT number of NV, NV name, NV address, Phone number NV)
  2. Invoice – Drug (Invoice code, Medicine code, Drug name, Uses, Production date, SD term, Quantity)

2NF – Eliminate functional dependencies not entirely on the primary key

The Invoice relationship has only a single key, so this relationship is at 2NF.

Considering the relationship between Invoices and Drugs: Drug name, Uses, Production date, SD term depends only on the Drug Code but does not depend on the whole course so this relationship is separated into the following two relations:

  • Invoice – Medication (Invoice code, Medicine code, Quantity)
  • Drugs (Drug code, medicine name, Uses, Production date, SD term)

Thus, in the form of 2NF we have 3 relations:

  1. Invoice (Invoice code, Sale date, Customer code, Customer Name, Customer number, Employee code, CMT number of NV, NV name, NV address, Phone number NV)
  2. Invoice – Medication (Invoice code, Medicine code, Quantity)
  3. Drugs (Drug code, medicine name, Uses, Production date, SD term)

3NF – Removes bridging dependencies into primary keys

In the Invoice relationship, we see Customer Name, Customer Number depending only on Customer Code. CMT number of NV, Name NV, NV Address only depends on employee code. Therefore this relationship is separated into the following three relations:

  • Invoice (Invoice code, Sale date, Customer code, Employee code)
  • Customer (Customer Code, Customer Name, Customer Number)
  • Staff (Employee Code, CMT Number of NV, Name of NV, Address of NV, Phone Number NV)

Thus, at 3NF, we have the following 5 relations:

  1. Invoice (Invoice code, Sale date, Customer code, Employee code)
  2. Customer (Customer Code, Customer Name, Customer Number)
  3. Staff (Employee Code, CMT Number of NV, Name of NV, Address of NV, Phone Number NV)
  4. Invoice – Medication (Invoice code, Medicine code, Quantity)
  5. Drugs (Drug code, medicine name, Uses, Production date, SD term)

Tutorial 3 – Standardizing on BCNF

Here, there is the database textbook (in English), which refers to standardization very easy to understand.

Deeply presented about anomaly, how the standard forms solve anomaly.

In general, standardizing a database is a matter of knowing while studying, because in addition to the fact that the higher the standard is, the more redundant the database is – Depending on the purpose of use, it is standardized or not.

Filter customer Insights with data – the decision maker’s decision factor
Steps to standardize basic database
Share the news now

Source : viblo