Handling Large Objects (BLOBs and CLOBs)
Problem: What if you want to store a video file (500 MB) in a database where block size is 8 KB?
Solution:
-
Store the large object in a separate file
-
Store only a pointer in the regular record
Example:
Student Record:
- ID: 12345
- Name: "John Doe"
- Photo: pointer_to_photo_file_location
Types of Large Objects:
-
BLOB: Binary Large Object (images, videos, audio)
-
CLOB: Character Large Object (large text documents)
File Organization Methods
1. Heap File Organization
Concept: Records can be placed anywhere in the file where space is available.
Example: Like throwing clothes into a closet randomly
File: [Record 5][Record 2][Record 8][Record 1][Record 3]
Characteristics:
-
✅ Simple to implement
-
✅ Fast insertions
-
❌ Slow searches (must check every record)
-
❌ No ordering
2. Sequential File Organization
Concept: Records are stored in sorted order based on a search key.
Example: Like organizing books alphabetically by title
Sorted by ID:
[ID:10101][ID:12121][ID:15151][ID:22222][ID:32343]
Benefits:
-
✅ Fast for range queries ("Find all students with IDs 12000-15000")
-
✅ Efficient sequential processing
-
❌ Insertions are complex (need to maintain order)
Insertion Challenge: When inserting ID:13500 between 12121 and 15151:
-
If space available in same block → insert directly
-
If no space → use overflow block and adjust pointers
3. Hash File Organization
Concept: Use a mathematical function to determine where to store each record.
Example:
Hash function: ID % 4 (remainder when divided by 4)
ID 10101 → 10101 % 4 = 1 → Store in bucket 1
ID 12121 → 12121 % 4 = 1 → Store in bucket 1
ID 15151 → 15151 % 4 = 3 → Store in bucket 3
Benefits:
-
✅ Very fast for exact matches
-
❌ Poor for range queries
4. Multitable Clustering
Concept: Store related records from different tables in the same block.
Example: Store student record with their enrollment records
Block 1: [Student: John] [His Course: Math] [His Course: Physics]
Block 2: [Student: Mary] [Her Course: Chemistry] [Her Course: Biology]
Benefits:
-
✅ Fast joins (related data already together)
-
❌ Complex storage management