Database Design and Normalization

  • ID: DBSQL-F-L05
  • Type: Lesson
  • Audience: Public
  • Theme: Integrity, redundancy, and schema discipline

Schema design determines whether queries remain reliable.

Normalization reduces redundancy and prevents update, insert, and delete anomalies.

This lesson follows standard database design treatments Date (2004); Silberschatz, Korth, and Sudarshan (2019).


Setup


Inspect Table Structure

SQLite provides schema inspection through PRAGMA.

df = pd.read_sql("PRAGMA table_info(customers);", con)
print(df)
   cid         name  type  notnull dflt_value  pk
0    0  customer_id  TEXT        0       None   1
1    1    full_name  TEXT        1       None   0
2    2         city  TEXT        1       None   0
3    3      country  TEXT        1       None   0
4    4  signup_date  DATE        1       None   0
df = pd.read_sql("PRAGMA table_info(orders);", con)
print(df)
   cid         name  type  notnull dflt_value  pk
0    0     order_id  TEXT        0       None   1
1    1  customer_id  TEXT        1       None   0
2    2   order_date  DATE        1       None   0
3    3       status  TEXT        1       None   0

Why Normalization Helps

If customer information were repeated in every order row, a city update would require changing many rows.

Separating customers from orders prevents redundancy.

In this schema:

  • customers stores identity and profile attributes
  • orders stores transaction events
  • order_items stores line items
  • products stores product attributes

The goal is one entity per table and relationships enforced through keys.


Constraints Protect Integrity

Constraints prevent invalid data at the database level.

df = pd.read_sql("SELECT sql FROM sqlite_master WHERE name = 'products';", con)
print(df)
                                                 sql
0  CREATE TABLE products (\n  product_id   TEXT P...

Look for:

  • PRIMARY KEY
  • NOT NULL
  • CHECK constraints

Design discipline reduces analytical fragility.