Functional Dependencies, Armstrong's rule, & Normalization
What are Functional Dependencies?
- Functional dependencies in database management refer to relationships between attributes in a table
- where the value of one or more attributes determines the value of another attribute(s).
Process of Functional Dependencies
Identification
Identify dependencies between attributes based on how their values relate to each other.
Representation
Represent dependencies using functional dependency notation (X -> Y), where X determines Y.
Normalization
Utilize functional dependencies to normalize the database schema, reducing redundancy and improving data integrity.
Example
- Consider a table "Employees" with attributes (EmployeeID, Name, DepartmentID, DepartmentName).
- Here, DepartmentName is functionally dependent on DepartmentID, as each DepartmentID uniquely determines the DepartmentName.
- The functional dependency can be represented as DepartmentID -> DepartmentName.
Importance
Data Integrity
Ensures data consistency and accuracy by avoiding inconsistent or redundant data entries.
Database Design
Guides the design process, helping to organize data efficiently and minimize data redundancy.
Normalization
Forms the basis for normalization techniques like 1NF, 2NF, 3NF, etc., ensuring databases adhere to standard forms.
Limitations
- Complexity Managing functional dependencies can become complex in large databases with numerous attributes and relationships.
- Performance Impact: Excessive normalization based on functional dependencies can impact query performance due to increased join operations.
Armstrong's Inference Rule
- Armstrong's inference rules are a set of rules used in database normalization and dependency analysis.
- These rules help identify functional dependencies between attributes,
- aiding in the normalization process to eliminate data redundancy and improve data integrity.
- Armstrong's inference rules are based on logical implications and can be
- expressed in the form of implications, guiding the analysis of functional dependencies in a database schema.
Rules
Reflexivity (Reflexive Rule)
- If Y is a subset of X, then X -> Y holds.
- Example: If {A, B} -> {A}, then {A, B} -> {A, B} holds.
Augmentation (Augmentation Rule)
- If X -> Y holds, then XZ -> YZ holds for any Z.
- Example: If {A} -> {B}, then {A, C} -> {B, C} holds.
Transitivity (Transitive Rule)
- If X -> Y and Y -> Z hold, then X -> Z holds.
- Example: If {A} -> {B} and {B} -> {C}, then {A} -> {C} holds.
Union (Union Rule)
- If X -> Y and X -> Z hold, then X -> YZ holds.
- Example: If {A} -> {B} and {A} -> {C}, then {A} -> {B, C} holds.
Decomposition (Decomposition Rule)
- If X -> YZ holds, then X -> Y and X -> Z hold.
- Example: If {A} -> {B, C}, then {A} -> {B} and {A} -> {C} hold.
Importance of Armstrong's Inference Rule
Dependency Analysis
Armstrong's rules are fundamental in analyzing functional dependencies and identifying key relationships in a database schema.
Normalization
These rules guide the normalization process, ensuring databases adhere to higher normal forms like 2NF, 3NF, BCNF, etc.
Data Integrity
By reducing data redundancy and anomalies, applying Armstrong's rules improves data integrity and consistency.
Data Normalization in DBMS
- Normalization is the process of organizing data in a database to eliminate redundancy and dependency anomalies.
- It involves dividing large tables into smaller, more manageable ones and defining relationships between them to achieve data integrity and efficiency.
- Reduce data redundancy : Store each piece of information in one place to avoid repetition.
- Minimize data modification anomalies : Ensure that updates or deletions in data do not cause inconsistencies.
- Improve data integrity : Maintain accurate and reliable data across the database.
- Enhance database structure : Organize data logically to improve query performance and maintainability.
Normal Forms
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)
First Normal Form (1NF)
- Each table has a primary key.
- All attributes are atomic (indivisible).
- Example: Consider a table "Students" with columns (StudentID, Name, Courses).
- Break "Courses" into a separate table "StudentCourses" with columns (StudentID, Course).
Second Normal Form (2NF)
- Must be in 1NF.
- Eliminate partial dependencies by moving attributes that depend on part of the primary key to separate tables.
- Example: In "Student Courses," if (StudentID, Course) is the composite primary
- key and "Course" depends only on "Course," create a "Courses" table with columns (Course, CourseName).
Third Normal Form (3NF)
- Must be in 2NF.
- Eliminate transitive dependencies by moving non-key attributes that depend on other non-key attributes to separate tables.
- Example: If "Course" depends on "Department," create a "Departments" table with columns (DepartmentID, DepartmentName) and link it to the "Courses" table.
Boyce-Codd Normal Form (BCNF)
- Must be in 3NF.
- Every determinant is a candidate key.
- Example: If there are functional dependencies like StudentID -> Name and StudentID, Course -> Grade, ensure StudentID is the candidate key, and Course is functionally dependent on StudentID.
Example
Step 1: Create the Initial Tables
1-- Create Employees table
2CREATE TABLE Employees (
3 EmpID INT PRIMARY KEY,
4 Name VARCHAR(100) NOT NULL,
5 DeptID INT,
6 DeptName VARCHAR(100)
7);
8
9-- Create Departments table
10CREATE TABLE Departments (
11 DeptID INT PRIMARY KEY,
12 DeptName VARCHAR(100),
13 ManagerID INT,
14 ManagerName VARCHAR(100)
15);
Step 2: Normalize the Employees Table
1-- Create a new table for Employee Details
2CREATE TABLE EmployeeDetails (
3 EmpID INT PRIMARY KEY,
4 DeptName VARCHAR(100),
5 FOREIGN KEY (EmpID) REFERENCES Employees(EmpID)
6);
7
8-- Drop redundant column DeptName from Employees table
9ALTER TABLE Employees DROP COLUMN DeptName;
10
Step 3: Normalize the Departments Table
1-- Create a new table for Managers
2CREATE TABLE Managers (
3 ManagerID INT PRIMARY KEY,
4 ManagerName VARCHAR(100),
5 FOREIGN KEY (ManagerID) REFERENCES Employees(EmpID)
6);
7
8-- Drop redundant columns ManagerID and ManagerName from Departments table
9ALTER TABLE Departments DROP COLUMN ManagerID;
10ALTER TABLE Departments DROP COLUMN ManagerName;
This normalization process ensures that data is organized efficiently, eliminating redundancy and maintaining data integrity.
Conclusion
Now we have basic understanding of Data Normalization: Functional dependencies, Armstrong's inference rule, & Normalization (Upto BCNF).