Data Normalisation

Data normalisation is a method widely employed for defining entities and their structure in a way that minimises storage space requirements and database maintenance. For example, the following customer order line entity or record could be defined:

Customer order line record
1 Customer number
Name
Order number
Line number
Material Code
Quantity
Description

The 1 indicates that the customer number is the key to the record. The first stage in normalisation is to take out the item that repeats most frequently and treat it as a separate record, but include a key to it in the original record. A customer may place several orders, so the first normal form is represented as follows:

Customer record

Order line record

1 Customer number

1 Customer number

Name

2 Order number

 

Line number

 

Material Code

 

Quantity

 

Description

1 and 2 indicate that the order record now has a compound key, that is a key with more than one part, comprising the customer number plus the order number. These partial key dependencies are removed when going from first normal form to second normal form. This is done by examining records having a compound key and checking to see whether each data field in the record relates to the whole key. If a field relates to part of the key only, it is removed with its key to form another record. In our example Line number, Material code, Quantity and Description relate to the Order number but not the Customer number. So they can be extracted into a separate Order line record, as follows:

Customer record

Order record

1 Customer number

1 Customer number

Name

2 Order number

 

 

Order line record

 

1 Order number

 

2 Line number

 

Material Code

 

Quantity

 

Description

 

The final stage of the analysis, the reduction to third normal form, involves examining each record to see whether any items are mutually dependent. If there are any, they are removed to a separate record, leaving one of the items behind in the original record and using that as the key in the newly extracted record. In the example, Material code and Description are mutually dependent because Material code will always have the same Description. Here is the final definition:

Customer record

Order record

1 Customer number

1 Customer number

Name

2 Order number

 

 

Order line record

Material record

1 Order number

1 Material code

2 Line number

Description

Material Code

 

Quantity

 

A test can now be applied to ensure that the records are in third normal form. The following questions are asked:

If the answer is 'Yes' in both cases, the records are in third normal form.