Page 4: Large Objects and File Organization Methods

Handling Large Objects (BLOBs and CLOBs)

Problem: What if you want to store a video file (500 MB) in a database where block size is 8 KB?

Solution:

Store the large object in a separate file
Store only a pointer in the regular record

Example:

Student Record:
- ID: 12345
- Name: "John Doe"  
- Photo: pointer_to_photo_file_location

Types of Large Objects:

BLOB: Binary Large Object (images, videos, audio)
CLOB: Character Large Object (large text documents)

File Organization Methods

1. Heap File Organization

Concept: Records can be placed anywhere in the file where space is available.

Example: Like throwing clothes into a closet randomly

File: [Record 5][Record 2][Record 8][Record 1][Record 3]

Characteristics:

✅ Simple to implement
✅ Fast insertions
❌ Slow searches (must check every record)
❌ No ordering

2. Sequential File Organization

Concept: Records are stored in sorted order based on a search key.

Example: Like organizing books alphabetically by title

Sorted by ID:
[ID:10101][ID:12121][ID:15151][ID:22222][ID:32343]

Benefits:

✅ Fast for range queries ("Find all students with IDs 12000-15000")
✅ Efficient sequential processing
❌ Insertions are complex (need to maintain order)

Insertion Challenge: When inserting ID:13500 between 12121 and 15151:

If space available in same block → insert directly
If no space → use overflow block and adjust pointers

3. Hash File Organization

Concept: Use a mathematical function to determine where to store each record.

Example:

Hash function: ID % 4 (remainder when divided by 4)
ID 10101 → 10101 % 4 = 1 → Store in bucket 1
ID 12121 → 12121 % 4 = 1 → Store in bucket 1  
ID 15151 → 15151 % 4 = 3 → Store in bucket 3

Benefits:

✅ Very fast for exact matches
❌ Poor for range queries

4. Multitable Clustering

Concept: Store related records from different tables in the same block.

Example: Store student record with their enrollment records

Block 1: [Student: John] [His Course: Math] [His Course: Physics]
Block 2: [Student: Mary] [Her Course: Chemistry] [Her Course: Biology]

Benefits:

✅ Fast joins (related data already together)
❌ Complex storage management