Database Operations: Join vs Cartesian Product

Natural Join Operation (⋈)

Combines relations by matching tuples on common attributes.

Example:

R(A, B) = {(1, 2), (2, 3), (3, 4)}
S(B, C) = {(2, 5), (3, 6), (5, 7)}

R ⋈ S = {(1, 2, 5), (2, 3, 6)}

Only matching B values are joined.

Cartesian Product (×)

Combines every tuple from one relation with every tuple from another.

Example:

R(A, B) = {(1, 2), (3, 4)}
S(C, D) = {(5, 6), (7, 8)}

R × S = {(1, 2, 5, 6), (1, 2, 7, 8), (3, 4, 5, 6), (3, 4, 7, 8)}

Lossy vs Lossless Decomposition

Bad Decomposition Example: Original: employee(ID, name, street, city, salary)

(57766, Kim, Main, Perryridge, 75000)
(98776, Kim, North, Hampton, 67000)

Decomposed into:

  • employee1(ID, name): {(57766, Kim), (98776, Kim)}

  • employee2(name, street, city, salary): {(Kim, Main, Perryridge, 75000), (Kim, North, Hampton, 67000)}

Problem: Joining back creates spurious tuples:

(57766, Kim, Main, Perryridge, 75000)  ✓ Original
(57766, Kim, North, Hampton, 67000)    ✗ Spurious
(98776, Kim, Main, Perryridge, 75000)  ✗ Spurious  
(98776, Kim, North, Hampton, 67000)    ✓ Original

Lesson: Always decompose using proper keys to avoid data loss.

Updated on