Skip to content
ADevGuide Logo ADevGuide
Go back

Database Normalization Explained: 1NF, 2NF, and 3NF with Examples

By Pratik Bhuite | 16 min read

Hub: Java / Interview Fundamentals

Series: Database Fundamentals

Last verified: Apr 17, 2026

Part 6 of 10 in the Database Fundamentals

Key Takeaways

On this page
Reading Comfort:

Database Normalization Explained: 1NF, 2NF, and 3NF

Database normalization is the process of structuring tables to reduce data redundancy and prevent data anomalies. Without it, a single schema mistake can make your data inconsistent in ways that are difficult to detect and expensive to fix.

This guide walks through the three most commonly used normal forms - 1NF, 2NF, and 3NF - with concrete before-and-after table examples. If you are building your database fundamentals, pair this with Relational Databases Explained: Tables, Rows, and Keys, Primary Key vs Foreign Key Explained with Examples, and What Are Database Indexes? How They Improve Performance. For the full roadmap, use the Database Fundamentals series as the pillar page and the Database tag as the category page.

Table of Contents

Open Table of Contents

What Is Database Normalization?

Database normalization is a systematic approach to organizing columns and tables in a relational database. The goal is to eliminate redundant data (storing the same fact more than once) and ensure data dependencies are logical (only storing related data in a table).

The process is built on a series of rules called normal forms. Each normal form is cumulative - to be in 2NF, a table must already satisfy 1NF. To be in 3NF, it must already satisfy 2NF.

flowchart TD
    A[Unnormalized Table] --> B[1NF\nAtomic values + unique rows]
    B --> C[2NF\nNo partial dependencies]
    C --> D[3NF\nNo transitive dependencies]
    D --> E[BCNF\nEvery determinant is a candidate key]

In most production systems, reaching 3NF is the practical target.

Why Normalization Matters: Data Anomalies

Without normalization, poorly structured tables cause three types of anomalies.

Insert Anomaly

You cannot add data without also adding unrelated data.

order_idcustomer_namecustomer_cityproduct_nameprice
101AliceNew YorkKeyboard49.99
102BobChicagoMonitor299.99

If you want to add a new customer before they place an order, you have no row to put them in because order_id is required.

Update Anomaly

Changing one fact requires updating multiple rows.

If Alice moves from New York to Austin, you must update every row where customer_name = 'Alice'. Missing even one row makes the data inconsistent.

Delete Anomaly

Deleting one fact accidentally removes another.

If you delete order 101 (the only order from Alice), you lose Alice’s customer record entirely.

All three anomalies share one root cause: unrelated facts are stored in the same table.

First Normal Form (1NF): Atomic Values

A table is in First Normal Form when:

  1. Every column holds atomic (indivisible) values - no lists, sets, or comma-separated strings in a cell.
  2. Every row is unique (there is a primary key).

Violation Example

order_idcustomerproducts
101AliceKeyboard, Mouse
102BobMonitor

The products column holds multiple values in one cell. This violates 1NF.

1NF Fix: One Value Per Cell

Split the multi-value column into separate rows.

order_idcustomerproduct
101AliceKeyboard
101AliceMouse
102BobMonitor

Now every cell holds one atomic value and you can use (order_id, product) as a composite primary key. The table is in 1NF.

SQL for a 1NF Table

CREATE TABLE order_items (
  order_id   INT,
  customer   VARCHAR(100),
  product    VARCHAR(100),
  PRIMARY KEY (order_id, product)
);

Second Normal Form (2NF): No Partial Dependencies

A table is in Second Normal Form when:

  1. It is already in 1NF.
  2. Every non-key column depends on the entire primary key, not just part of it.

Partial dependency only applies to tables with a composite primary key.

Violation Example

order_idproduct_idproduct_namequantity
101P01Keyboard2
101P02Mouse1
102P01Keyboard3

The composite primary key is (order_id, product_id).

  • quantity depends on both order_id and product_id - correct.
  • product_name depends only on product_id - a partial dependency. This violates 2NF.

If the keyboard’s name changes, you must update every row with product_id = P01.

2NF Fix: Move Partial Dependencies to Their Own Table

order_items (fact about each order line):

order_idproduct_idquantity
101P012
101P021
102P013

products (fact about each product):

product_idproduct_name
P01Keyboard
P02Mouse

Now product_name lives in one place. Updating it updates it everywhere.

SQL for 2NF Tables

CREATE TABLE products (
  product_id   CHAR(3) PRIMARY KEY,
  product_name VARCHAR(100) NOT NULL
);

CREATE TABLE order_items (
  order_id   INT,
  product_id CHAR(3) REFERENCES products(product_id),
  quantity   INT NOT NULL,
  PRIMARY KEY (order_id, product_id)
);

Third Normal Form (3NF): No Transitive Dependencies

A table is in Third Normal Form when:

  1. It is already in 2NF.
  2. Every non-key column depends only on the primary key, not on another non-key column.

A dependency of the form primary_key → column_A → column_B is called a transitive dependency. It violates 3NF because column_B depends on column_A, not directly on the primary key.

Violation Example

employee_idemployee_namedepartment_iddepartment_name
E01AliceD10Engineering
E02BobD20Marketing
E03CarolD10Engineering

The primary key is employee_id.

  • department_id depends on employee_id - correct.
  • department_name depends on department_id, not on employee_id directly - a transitive dependency. This violates 3NF.

If Engineering is renamed to Platform Engineering, every employee row with department_id = D10 must be updated.

3NF Fix: Move Transitive Dependencies to Their Own Table

employees (fact about each employee):

employee_idemployee_namedepartment_id
E01AliceD10
E02BobD20
E03CarolD10

departments (fact about each department):

department_iddepartment_name
D10Engineering
D20Marketing

Now renaming a department requires updating exactly one row.

SQL for 3NF Tables

CREATE TABLE departments (
  department_id   CHAR(3) PRIMARY KEY,
  department_name VARCHAR(100) NOT NULL
);

CREATE TABLE employees (
  employee_id   CHAR(3) PRIMARY KEY,
  employee_name VARCHAR(100) NOT NULL,
  department_id CHAR(3) REFERENCES departments(department_id)
);

BCNF: One Step Further

Boyce-Codd Normal Form (BCNF) is a stricter version of 3NF. A table is in BCNF when every determinant (every column that determines another column’s value) is a candidate key.

Most tables in 3NF are already in BCNF. The edge cases that separate the two are uncommon in practice (they arise from overlapping composite candidate keys). For most application databases, reaching 3NF is sufficient.

When to Denormalize

Normalization is a design best practice, not an absolute rule. There are situations where intentional denormalization makes sense:

Read-Heavy Analytical Workloads

Joining many normalized tables on every query can be expensive at scale. Data warehouses and reporting systems often denormalize into wide flat tables or use pre-aggregated summary tables to reduce join cost.

Caching and Derived Data

Storing a computed field (such as a running total) alongside the source data avoids recalculating it on every read. The tradeoff is that writes must keep the derived column in sync.

Performance Bottlenecks with Clear Measurement

If a profiled query is consistently slow because of a join and you have exhausted indexing options, denormalizing that specific path with careful documentation is a reasonable tradeoff. Do not denormalize speculatively.

The rule of thumb: normalize first, denormalize only where you have measured evidence that the join is the bottleneck.

Normalization at a Glance

Normal FormWhat It EliminatesKey Requirement
1NFRepeating groups, multi-valuesAtomic values; every row uniquely identified
2NFPartial dependenciesNon-key columns depend on the entire composite key
3NFTransitive dependenciesNon-key columns depend only on the primary key
BCNFRemaining functional anomaliesEvery determinant is a candidate key

Interview Questions

1. What is database normalization and why is it important?

Normalization organizes a database to reduce redundancy and prevent update, insert, and delete anomalies. It produces schemas where each fact is stored once and changes propagate without inconsistency.

2. What is the difference between 1NF, 2NF, and 3NF?

1NF requires atomic values and a unique primary key. 2NF additionally requires that every non-key column depends on the full composite primary key (no partial dependencies). 3NF additionally requires that every non-key column depends directly on the primary key with no transitive dependencies through another non-key column.

3. What is a partial dependency?

A partial dependency occurs in a table with a composite primary key when a non-key column depends on only part of that key. It violates 2NF and is fixed by moving the partially dependent column to a separate table keyed by the relevant part of the composite key.

4. What is a transitive dependency?

A transitive dependency occurs when a non-key column depends on another non-key column rather than directly on the primary key. It violates 3NF and is fixed by extracting the transitively dependent columns into their own table.

5. When would you denormalize a database?

Denormalization is appropriate when profiling reveals that join cost is a real bottleneck, typically in read-heavy analytical systems or reporting queries. Always normalize first and denormalize only where you have measured evidence, not as a premature optimization.

6. What is the difference between 3NF and BCNF?

Both eliminate transitive dependencies, but BCNF is stricter: it requires every determinant to be a candidate key. Most 3NF tables are also in BCNF; the distinction matters only in rare schemas with overlapping composite candidate keys.

Conclusion

Database normalization is a foundational design skill. Starting with 1NF eliminates multi-valued cells. Moving to 2NF removes partial dependencies that cause redundant updates. Reaching 3NF eliminates transitive dependencies and produces a schema where each fact lives in exactly one place.

In practice, design to 3NF by default. Introduce denormalization deliberately and only where you have measurement data showing that the normalized design is a genuine bottleneck.

References

  1. Database Normalization: 1NF, 2NF, 3NF & BCNF Examples - DigitalOcean
  2. Database Normalization - Normal Forms 1NF 2NF 3NF Table Examples - freeCodeCamp
  3. Normalization in SQL (1NF-5NF): A Beginner’s Guide - DataCamp

YouTube Videos

  1. “Database Normalization - 1NF, 2NF, 3NF, 4NF” - Decomplexify [https://www.youtube.com/watch?v=GFQaEYEc8_8]

  2. “Database Normalization Explained | 1NF 2NF 3NF BCNF” - Socratica [https://www.youtube.com/watch?v=UrYLYV7WSHM]

  3. “Learn Database Normalization - 1NF, 2NF, 3NF, 4NF, 5NF” - Database Star [https://www.youtube.com/watch?v=J-drts33N8g]


Share this post on:

Next in Series

Continue through the Database Fundamentals with the next recommended article.

Related Posts

Keep Learning with New Posts

Subscribe through RSS and follow the project to get new series updates.

Was this guide helpful?

Share detailed feedback

Previous Post
ACID Properties Explained: Atomicity, Consistency, Isolation, and Durability
Next Post
Basic SQL Queries Every Developer Must Know