CS2855 Normalisation

CS2855 Database Normalization Revision Notes

1. Functional Dependencies (FDs)

  • Definition: A FD ๐›ผ โ†’ ๐›ฝ asserts that if two tuples agree on ๐›ผ, they must agree on ๐›ฝ.
  • Armstrongโ€™s Axioms:
    1. Reflexivity: If ๐›ฝ โІ ๐›ผ, then ๐›ผ โ†’ ๐›ฝ (e.g., AB โ†’ A).
    2. Augmentation: If ๐›ผ โ†’ ๐›ฝ, then ๐›พ๐›ผ โ†’ ๐›พ๐›ฝ (e.g., A โ†’ C โŸน AB โ†’ BC).
    3. Transitivity: If ๐›ผ โ†’ ๐›ฝ and ๐›ฝ โ†’ ๐›พ, then ๐›ผ โ†’ ๐›พ.

2. Closure of FDs

  • Closure (Fโบ): All FDs logically implied by a set F.
  • Attribute Closure (๐›ผโบ): Set of attributes determined by ๐›ผ using F.
  • Algorithm to Compute ๐›ผโบ:

Initialize result = ๐›ผ
Repeat:
For each FD ๐›ฝ โ†’ ๐›พ in F:
If ๐›ฝ โІ result, add ๐›พ to result
Until no more changes
Return result

3. Redundant Attributes & Canonical Cover

  • Redundant Attribute: An attribute in an FD that can be removed without changing Fโบ.
  • Example: Given F = {A โ†’ C, AB โ†’ CD}, C is redundant in AB โ†’ CD because {A โ†’ C, AB โ†’ D} โŸน AB โ†’ CD.
  • Canonical Cover:
  1. Decompose FDs to single RHS (e.g., A โ†’ BC becomes A โ†’ B, A โ†’ C).
  2. Remove redundant LHS attributes (e.g., ABC โ†’ D becomes AB โ†’ D if C is redundant).
  3. Eliminate redundant FDs (e.g., if A โ†’ B is implied by other FDs).

4. Decomposition

  • Lossless Decomposition: Original relation can be recovered by a natural join of sub-relations.
  • Check Lossless:
    • If (R1 โˆฉ R2) is a superkey for R1 or R2.
  • Dependency Preservation: All FDs in Fโบ can be checked in the decomposed relations.

5. Normal Forms

a) First Normal Form (1NF)

  • All attributes are atomic (no multi-valued/composite attributes).

b) Boyce-Codd Normal Form (BCNF)

  • Definition: For every non-trivial FD ๐›ผ โ†’ ๐›ฝ in Fโบ, ๐›ผ is a superkey.
  • Decomposition Algorithm:
  • Find a violating FD ๐›ผ โ†’ ๐›ฝ (where ๐›ผ is not a superkey).
  • Decompose into (๐›ผ โˆช ๐›ฝ) and (R โ€“ ๐›ฝ).
  • Repeat until all relations are in BCNF.

c) Third Normal Form (3NF)

  • Definition: For every non-trivial FD ๐›ผ โ†’ ๐›ฝ in Fโบ:
  • ๐›ผ is a superkey, OR
  • ๐›ฝ is a prime attribute (part of a candidate key).
  • Advantage: Always preserves dependencies.

BCNF vs 3NF

  • BCNF eliminates all redundancy but may lose dependencies.
  • 3NF compromises to preserve dependencies.

Past Paper Question Types & Solutions

Type 1: Identify Normal Form

Steps:

  1. List all FDs.
  2. Find candidate keys.
  3. For each FD, check:
  • BCNF: If ๐›ผ is a superkey.
  • 3NF: If ๐›ฝ is prime or ๐›ผ is a superkey.

Example (CS2855-22):
Given R(A,B,C,D) with FDs AB โ†’ C, C โ†’ D:

  • Candidate keys: AB (determines C and D via FDs).
  • Violating FD: C โ†’ D (C is not a superkey).
  • Not in BCNF (C is not a superkey) but satisfies 3NF (D is prime? No โ†’ Check if D is prime).

Type 2: Decompose into BCNF/3NF

Example:
Decompose R(A,B,C,D) with FDs AB โ†’ C, C โ†’ D into BCNF.

  1. Find violating FD: C โ†’ D (C is not a superkey).
  2. Decompose:
  • R1 = (C,D)
  • R2 = (A,B,C)
  1. Check: R1 is in BCNF (key = C), R2 (key = AB โ†’ C, satisfies BCNF).

Type 3: Check Lossless Join

Example:
Decomposition R1(A,B), R2(B,C) where B โ†’ A.

  • R1 โˆฉ R2 = B.
  • B is a key for R1 (B โ†’ A).
  • Lossless as B is a key in R1.

Important Notes

  • E-R to Relational Model: Translate entities to tables, relationships to foreign keys.
  • SQL Queries:
  • GROUP BY is used with aggregate functions (SUM, AVG).
  • HAVING filters groups; WHERE filters rows.
  • Example:
    1
    2
    3
    4
    5
    SELECT dept, AVG(salary) 
    FROM employees
    WHERE age > 30
    GROUP BY dept
    HAVING AVG(salary) > 50000;
  • Transaction ACID: Ensure atomicity, consistency, isolation, durability.

CS2855 Normalisation
https://blog.pandayuyu.zone/2025/05/13/CS2855-Normalisation/
Author
Panda
Posted on
May 13, 2025
Licensed under