Page 5: Introduction to Indexing

What is an Index?

An index is like a book's table of contents - it helps you find information quickly without reading the entire book.

Without Index: To find all Physics department instructors

  1. Read record 1 → Check department → Not Physics

  2. Read record 2 → Check department → Not Physics

  3. Continue until end of file...

With Index:

  1. Look up "Physics" in index → Points to records 3, 5, 9

  2. Read only those specific records

Types of Indices

1. Ordered Indices

Records are arranged in sorted order by search key.

Example: Index on instructor salary

Index:        Data:
40000 →       Mozart, Music, 40000
60000 →       El Said, History, 60000  
65000 →       Srinivasan, Comp. Sci., 65000

2. Hash Indices

Use hash function to locate records quickly.

Example:

Hash function: Department name → Bucket number
"Physics" → Bucket 2
"Math" → Bucket 5

Primary vs Secondary Index

Primary Index (Clustering):

  • Data file is physically ordered by the index key

  • Example: Student file sorted by student ID, with index on student ID

Secondary Index (Non-clustering):

  • Data file ordered differently than index key

  • Example: Student file sorted by ID, but index created on student name

Dense vs Sparse Indices

Dense Index: Has an entry for every search key value

Index entries:
10101 → points to record with ID 10101
12121 → points to record with ID 12121  
15151 → points to record with ID 15151
... (entry for every record)

Sparse Index: Has entries only for some search key values (only works with clustering)

Index entries:
10101 → points to first record of block containing IDs 10101-15000
20001 → points to first record of block containing IDs 20001-25000
... (one entry per block)

Finding a record with sparse index: To find ID 12121:

  1. Look in index for largest entry ≤ 12121 → Find 10101

  2. Go to that block and scan sequentially until you find 12121

Multilevel Indices

When the index itself becomes too large to fit in memory.

Example:

Level 2 Index:    Level 1 Index:        Data:
A-G →            Adams →               Adams record
H-M →            Baker →               Baker record  
N-Z →            Chen →                Chen record
                 ...
                 Miller →              Miller record
                 ...
                 Zhang →               Zhang record
Updated on