Standardized relational database

Tram Ho

Recently, I was asked by a friend to do an exercise in database design according to 3NF standards, to be honest, these were the university studies and I have forgotten until now ?)). So I decided to write an article to note what I have read and to refer to later.


Most web applications need to have a database to store data, process information, render reports, assist search … When data becomes the soul of the application. web, the web application works well first of all must have an efficient storage database. I used to do big exercises but I was eager to get into the code, and then after a small change, I had to rebuild the DB from the beginning. From there, I draw one thing, database design is an extremely important part, if carefully designed, then you will save a lot of time in the development process. And to optimize the database, you should follow the design standards that I have listed below.

The purpose of database standardization

As I said at the beginning, database normalization has two purposes:

1. Giảm thiểu dư thừa dữ liệu

2. Loại bỏ các bất thường khi cập nhật cơ sở dữ liệu

Types of database normalization

There are 4 basic types of normalization:

  • First Normal Form (1NF) : 1NF standard format
  • Second Normal Form (2NF) : 2NF standard format
  • Third Nomal Form (3NF) : 3NF standard format
  • Boyce-Codd Normal Form (BCNF) : Boyce-Codd standard format

The standard forms are arranged in order from low to high. To standardize 2NF, your database must meet the 1NF standard, similarly if it meets the 3NF standard, it must meet the 1NF and 2NF standards. And Boyce-Codd standard will include 3 types of standards 1NF, 2NF and 3NF.

1NF standard format

Example 1 non-standardized table: The table has 3 primary keys: customer_id , order_id and product_id .

This data sheet violates the conditions of the 1NF standard because: address contains duplicate values, moreover, the address value in each row is not univalent (only 1 value), in addition, belonging to The total_amount calculation can be calculated by quantity * unit_price , not necessarily included in the table, causing data redundancy. Through the above comment, we can imagine 3 conditions that must be followed:

  • The properties of the table must be prime
  • The values ​​of the attributes on the rows must be unitary, not repeating groups
  • There is no one value that can be computed from another attribute

From there, we can redesign the above data table as follows:

  • Separate repeating attributes in the table such as customer_name , phone into a new table, customers
  • Separate the address into a separate table with the key customer_id to know which address belongs to which customer .
  • Remove total_amount attribute The result is as follows:

2NF standard format

Normalized rule from 1NF standard to 2NF:

  • Step 1: Eliminate keyless attributes that depend on a primary key component and split into a separate table, the primary key of the table is the part of the key they depend on.
  • Step 2: The remaining properties form a relationship, its primary key is the original primary key.

The new data table that we designed has not yet reached the 2NF standard because: some attributes such as description , unit_price depend on part of the key as product_id , not depending on the key set ( customer_id , order_id , product_id ), or the customer_name and phone properties only depend on customer_id , order_date attribute depends on customer_id and order_id , quantity attribute depends on order_id and product_id .

So to achieve 2NF standard, we will design the standard 1NF data table as follows:

  • Split properties ( product_id , description , unit_price ) into a separate table of products .
  • The attributes ( customer_id , order_id , order_date ) make a table, I named orders .
  • The remaining properties ( order_id , product_id , quantity ) make an intermediate table between products and orders , I set order_products .

Just follow the 2 standards that have been standardized database as follows:

3NF standard format

Condition :

  • Must meet 2NF standard
  • All non-key properties depend on transitive properties (meaning that all non-key properties must be inferred directly from the key property).

Normalized rule from 2NF to 3NF:

  • Step 1: Remove the transitive dependent properties from the relationship and separate them into a private key whose key is the bridging attribute.
  • Step 2: The remaining properties form a relationship whose primary key is the primary key.

Notice that the database we designed in the 2NF standard has also reached the 3NF standard. So I will take another example for your reference as follows:

For example, the following table violates 3NF standard:

We see the country_name attribute depends on country_id , and country_id depends on the primary key is id . So we should split the above table into the following 2 tables:

Boyce-Codd standard format

Condition :

  • Must meet 3NF standard
  • No key attribute depends on the keyless attribute

Normalized 3NF rule to Boyce-Codd:

  • Step 1: Remove function-dependent key properties from the non-key attribute from the relationship
  • Step 2: Split the just-removed attribute into a private key whose primary key is the non-key property that causes dependencies.

I have not thought of a real example, so I would like to describe the following: Assume the table has attributes (A, B, C, D, E, F) with AB being the primary key AB => C AB => D AB = > EF => B We see that the above table violates the Boyce-Codd standard because B is a key attribute but depends on F as a keyless attribute. From there, we will separate the table to follow the standard as follows: Table 1: (A, F, C, D, E) Table 2: (F, B) I see usually only need to follow the 3 standards 1NF -> 3NF is My database has been optimized already. This standard I raised for the purpose of more reference and 1 more higher standards I have not mentioned. Maybe later you have to design big data, you may make the mistakes mentioned above. And if you find an example of Boyce-Codd substandard, please comment for your reference.

My article here is the end, although there are many shortcomings but hope to bring a little knowledge for newcomers to embark on their own code. Thank you!


Share the news now

Source : Viblo