> Normal Forms(NF) of database theory provides the criteria for determining the table’s degree of vulnerability to logical inconsistencies and anomalies
> The higher the normal form is applicable to the table, the less it is vulnerable to inconsistencies and anomalies
> Forms are 1NF 2NF 3NF and BCNF which we will discuss now
First Normal form – 1NF
For a relational database table to adhere to 1NF, the following criteria must meet :
1.) There’s no top-to-bottom ordering of the rows.
2.) There’s no left-to-right ordering of the columns
3.) There are no duplicate values
4.) Every row-and-column intersection contains exactly one value from the applicable domain
5.) All columns are regular(i.e. rows have no hidden component)
In order to understand the 2NF, lets see what candidate key means
candidate key of a relation is a minimal superkey for that relation, a set of attributes, such that
a.) the relation does not have two distinct tuple with same values for these set of attributes
b.) there is no proper subset of these attributes for which (a.) holds.
Second Normal Form – 2NF
A table in 1NF must meet an additional criteria to qualify for 2NF.
A table in 1NF is in 2NF if and only if , given a candidate key and any attribute that is not a constituent of candidate key, the non-key attribute depends upon whole of the candidate key and no just a part of it
Note : A point to note here is if a table in 1NF doesn’t contains composite candidate keys, the able is automatically in 2NF
Third Normal Form – 3NF
A Table is in 3NF if and only if the following holds
a.) the table is in 2NF
b.) every non-prime attribute of relation R is non-transitively dependent
An example of a 2NF table that fails to meet the requirements of 3NF is:
Tournament Winners
| Tournament |
Year |
Winner |
Winner Date of Birth |
| Indiana Invitational |
1998 |
Al Fredrickson |
21 July 1975 |
| Cleveland Open |
1999 |
Bob Albertson |
28 September 1968 |
| Des Moines Masters |
1999 |
Al Fredrickson |
21 July 1975 |
| Indiana Invitational |
1999 |
Chip Masterson |
14 March 1977 |
Because each row in the table needs to tell us who won a particular Tournament in a particular Year, the composite key {Tournament, Year} is a minimal set of attributes guaranteed to uniquely identify a row. That is, {Tournament, Year} is a candidate key for the table.
The breach of 3NF occurs because the non-prime attribute Winner Date of Birth is transitively dependent on the candidate key {Tournament, Year} via the non-prime attribute Winner. The fact that Winner Date of Birth is functionally dependent on Winner makes the table vulnerable to logical inconsistencies, as there is nothing to stop the same person from being shown with different dates of birth on different records.
In order to express the same facts without violating 3NF, it is necessary to split the table into two:
Tournament Winners
| Tournament |
Year |
Winner |
| Indiana Invitational |
1998 |
Al Fredrickson |
| Cleveland Open |
1999 |
Bob Albertson |
| Des Moines Masters |
1999 |
Al Fredrickson |
| Indiana Invitational |
1999 |
Chip Masterson |
|
Player Dates of Birth
| Player |
Date of Birth |
| Chip Masterson |
14 March 1977 |
| Al Fredrickson |
21 July 1975 |
| Bob Albertson |
28 September 1968 |
|
we will talk about the BCNF in the next post …