Normalization

From Canonica AI

Introduction

Normalization is a process in database management used to design the schema of a relational database. Its main aim is to organize tables so that each fact is stored in one place, which allows a field to be added, deleted, or modified in a single table. It also eliminates redundant data and removes undesirable dependencies between attributes.

A relational database schema illustrating the concept of normalization

History

The concept of normalization was first developed by Edgar F. Codd at IBM. His 1970 paper, "A Relational Model of Data for Large Shared Data Banks," introduced the relational database model together with the first normal form. In 1971 he went on to define the second and third normal forms.

Purpose of Normalization

Normalization is used to minimize redundancy in a relation or set of relations. It is also used to eliminate undesirable characteristics such as insertion, update, and deletion anomalies. Normalization rules are divided into the following normal forms:

  • First Normal Form (1NF)
  • Second Normal Form (2NF)
  • Third Normal Form (3NF)
  • Boyce-Codd Normal Form (BCNF)
  • Fourth Normal Form (4NF)
  • Fifth Normal Form (5NF)
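
The update and deletion anomalies mentioned above can be illustrated with a minimal sketch. The table, its column names, and the data are hypothetical and exist only for illustration: the department phone number is repeated in every employee row.

```python
# Hypothetical denormalized table: each row repeats the department's phone number.
employees = [
    {"emp_id": 1, "name": "Ada", "dept": "Research", "dept_phone": "555-0100"},
    {"emp_id": 2, "name": "Grace", "dept": "Research", "dept_phone": "555-0100"},
    {"emp_id": 3, "name": "Edgar", "dept": "Sales", "dept_phone": "555-0200"},
]


def phones_by_dept(rows):
    """Collect every phone number recorded for each department."""
    phones = {}
    for row in rows:
        phones.setdefault(row["dept"], set()).add(row["dept_phone"])
    return phones


# Update anomaly: changing the phone in only one of the duplicated rows
# leaves the table contradicting itself.
employees[0]["dept_phone"] = "555-0199"
inconsistent = {d: p for d, p in phones_by_dept(employees).items() if len(p) > 1}

# Deletion anomaly: removing the last Sales employee also discards
# the only record of the Sales phone number.
remaining = [r for r in employees if r["emp_id"] != 3]
sales_known = "Sales" in phones_by_dept(remaining)
```

Here `inconsistent` ends up holding two different phone numbers for the Research department, and `sales_known` is false once the only Sales employee is removed. Storing departments in their own table would avoid both problems.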

First Normal Form (1NF)

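A relation is in first normal form if the domain of every attribute is atomic, that is, every attribute value is a single indivisible value rather than a set, list, or nested relation. All tuples in a relation draw the values of a given attribute from the same domain.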
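
As a rough sketch of what atomicity means in practice, the following example takes rows whose `phones` attribute holds a list and splits them into two relations with atomic values only. The `customer_id`, `name`, and `phones` names and the helper `to_1nf` are assumptions made for illustration.

```python
# Hypothetical rows in which "phones" holds a list, so the attribute is not atomic.
unnormalized = [
    {"customer_id": 1, "name": "Ada", "phones": ["555-0100", "555-0101"]},
    {"customer_id": 2, "name": "Grace", "phones": ["555-0200"]},
]


def to_1nf(rows):
    """Split the repeating group into its own relation, one phone per tuple."""
    customers = [{"customer_id": r["customer_id"], "name": r["name"]} for r in rows]
    phones = [
        {"customer_id": r["customer_id"], "phone": p}
        for r in rows
        for p in r["phones"]
    ]
    return customers, phones


customers, phones = to_1nf(unnormalized)
```

Each tuple in `phones` now carries exactly one phone number, and `customer_id` links it back to its customer.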

Second Normal Form (2NF)

A relation is in second normal form if it is in 1NF and every non-prime attribute is fully functionally dependent on every candidate key. In other words, no non-prime attribute depends on only part of a composite key.
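
The sketch below illustrates a partial dependency and one way to remove it. The `order_items` relation, its attribute names, and the helpers `fd_holds` and `project` are hypothetical. Note that a functional dependency is a constraint on the schema; checking it against sample rows can only show that the data does not contradict it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the sample rows satisfy the functional dependency lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (first-seen order)."""
    unique = dict.fromkeys(tuple(row[a] for a in attrs) for row in rows)
    return [dict(zip(attrs, values)) for values in unique]


# Assumed candidate key: (order_id, product_id).
order_items = [
    {"order_id": 1, "product_id": "P1", "quantity": 2, "product_name": "Widget"},
    {"order_id": 1, "product_id": "P2", "quantity": 1, "product_name": "Gadget"},
    {"order_id": 2, "product_id": "P1", "quantity": 5, "product_name": "Widget"},
]

# product_name depends on product_id alone, i.e. on part of the key.
partial = fd_holds(order_items, ["product_id"], ["product_name"])
# quantity needs the whole key; neither half determines it.
quantity_partial = fd_holds(order_items, ["product_id"], ["quantity"])

# 2NF decomposition: move the partially dependent attribute to its own relation.
products = project(order_items, ["product_id", "product_name"])
items = project(order_items, ["order_id", "product_id", "quantity"])
```

After the decomposition, each product name is stored once in `products`, and `items` keeps only attributes that depend on the full key.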

Third Normal Form (3NF)

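A relation is in third normal form if it is in 2NF and no non-prime attribute is transitively dependent on a candidate key. Every non-prime attribute must depend on the key directly rather than through another non-key attribute.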
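
A transitive dependency and its removal can be sketched with Python's built-in `sqlite3` module. The table and column names are hypothetical; the example assumes that `emp_id` determines `dept_id` and that `dept_id` in turn determines `dept_name`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_flat (
        emp_id    INTEGER PRIMARY KEY,
        name      TEXT,
        dept_id   INTEGER,
        dept_name TEXT   -- depends on dept_id, so only transitively on emp_id
    );
    INSERT INTO employee_flat VALUES
        (1, 'Ada',   10, 'Research'),
        (2, 'Grace', 10, 'Research'),
        (3, 'Edgar', 20, 'Sales');

    -- 3NF decomposition: move the transitively dependent attribute out.
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    INSERT INTO department SELECT DISTINCT dept_id, dept_name FROM employee_flat;

    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO employee SELECT emp_id, name, dept_id FROM employee_flat;
""")

original = conn.execute("SELECT * FROM employee_flat ORDER BY emp_id").fetchall()
# Joining the two 3NF tables reconstructs the original rows without loss.
rejoined = conn.execute("""
    SELECT e.emp_id, e.name, e.dept_id, d.dept_name
    FROM employee AS e JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()
department_rows = conn.execute("SELECT * FROM department ORDER BY dept_id").fetchall()
```

The department name is now stored once per department, and the join recovers the same rows as the flat table.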

Boyce-Codd Normal Form (BCNF)

A relation R is in Boyce-Codd normal form if, for every one of its functional dependencies X → Y, at least one of the following conditions holds:

  • X → Y is a trivial functional dependency (Y ⊆ X)
  • X is a superkey for schema R
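
The two conditions above can be checked mechanically with an attribute closure. The following is a minimal sketch, not a full normalization tool: the function names and the student/course/instructor schema are assumptions for illustration. It tests only the dependencies that are listed, which is enough for the relation itself but not for the sub-relations produced by a decomposition.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the listed FDs that are non-trivial and whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= schema
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations


# Hypothetical schema: each instructor teaches exactly one course.
schema = frozenset({"student", "course", "instructor"})
fds = [
    (frozenset({"student", "course"}), frozenset({"instructor"})),
    (frozenset({"instructor"}), frozenset({"course"})),
]
violations = bcnf_violations(schema, fds)
```

In this example the dependency instructor → course is reported, because the closure of {instructor} does not cover the whole schema, so instructor is not a superkey.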

Fourth Normal Form (4NF)

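A table is in 4NF if it is in BCNF and, for every non-trivial multivalued dependency X ↠ Y that holds in it, X is a superkey. Multivalued dependencies that are trivial or whose determinant is a superkey are allowed.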
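
As an illustration, the sketch below checks a multivalued dependency against sample rows and then decomposes the table. The relation (employees with independent sets of skills and languages) and the helper `mvd_holds` are hypothetical, and a check on sample data only shows that the rows are consistent with the dependency.

```python
from itertools import product


def mvd_holds(rows, x, y, z):
    """Check whether the sample rows satisfy x ->> y, where z is the remaining
    attribute. Each argument is a single attribute name."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # The MVD requires every combination of y and z values for this x.
        if pairs != set(product(ys, zs)):
            return False
    return True


rows = [
    {"emp": "Ada", "skill": "SQL", "language": "English"},
    {"emp": "Ada", "skill": "SQL", "language": "French"},
    {"emp": "Ada", "skill": "Python", "language": "English"},
    {"emp": "Ada", "skill": "Python", "language": "French"},
    {"emp": "Grace", "skill": "COBOL", "language": "English"},
]

# emp ->> skill holds, yet emp alone is not a superkey (the key is all three
# attributes), so this table is not in 4NF.
mvd_found = mvd_holds(rows, "emp", "skill", "language")

# 4NF decomposition: one relation per independent multivalued fact.
emp_skills = sorted({(r["emp"], r["skill"]) for r in rows})
emp_languages = sorted({(r["emp"], r["language"]) for r in rows})

# Joining the two relations on emp reconstructs the original rows.
rejoined = {(e, s, l) for e, s in emp_skills for e2, l in emp_languages if e == e2}
```

The five original rows shrink to three rows in each of the two smaller relations, and adding a new skill for an employee no longer requires one row per language.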

Fifth Normal Form (5NF)

A table is in 5NF, also known as Project-Join Normal Form (PJNF), if it is in 4NF and every non-trivial join dependency in the table is implied by the candidate keys of the table.
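
A join dependency states that a table can be rebuilt by joining certain of its projections. The sketch below tests this for a three-attribute table and its three pairwise projections. The agent/company/product data and the helper `spurious_tuples` are assumptions for illustration, and the check applies to sample rows rather than to the schema constraint itself.

```python
def spurious_tuples(rows):
    """Join the three binary projections of a ternary relation and return the
    tuples that the join produces but the original relation does not contain."""
    rows = set(rows)
    ab = {(a, b) for a, b, c in rows}
    bc = {(b, c) for a, b, c in rows}
    ac = {(a, c) for a, b, c in rows}
    joined = {
        (a, b, c)
        for (a, b) in ab
        for (b2, c) in bc
        if b2 == b and (a, c) in ac
    }
    return joined - rows


# Hypothetical (agent, company, product) facts.
lossy = [
    ("Smith", "Ford", "car"),
    ("Smith", "GM", "truck"),
    ("Jones", "Ford", "truck"),
]
# The same facts plus the tuple that the join would otherwise invent.
lossless = lossy + [("Smith", "Ford", "truck")]

extra = spurious_tuples(lossy)
none_extra = spurious_tuples(lossless)
```

For `lossy`, the join invents the tuple (Smith, Ford, truck), so the rows do not satisfy the join dependency and splitting the table into the three projections would lose information. For `lossless`, the join returns exactly the original rows.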

Advantages of Normalization

Normalization offers a number of advantages, including the following:

  • Greater overall database organization
  • Reduction of redundant data
  • Data consistency within the database
  • A much more flexible database design
  • A better handle on database security
  • Increased storage efficiency
  • Better access paths for the database

Disadvantages of Normalization

While normalization offers many benefits, it also has some disadvantages, such as:

  • Increased complexity
  • More difficult database design
  • Increased cost of execution for queries that must join several tables
  • Additional storage for the key columns and indexes that link the decomposed tables

Conclusion

Normalization is a crucial aspect of database design. A good understanding of the normalization process provides the designer with the tools to create databases that are both efficient and effective.

See Also