DIZNR INTERNATIONAL

GATE CSEIT/Database/Collision in Hashing Hashing collision.

GATE CSEIT/Database/Collision in Hashing Hashing collision.

https://www.gyanodhan.com/video/7B6.%20GATE%20CSEIT/Database/659.%20Collision%20in%20Hashing%20%20Hashing%20collision.mp4

Hashing and Collision Handling in Databases (GATE CSE/IT)

 What is Hashing?

Hashing is a technique used in database indexing & data structures to store and retrieve data efficiently. It uses a hash function to convert keys into a fixed-size address (hash value).

Formula for Hash Function:

h(K)=Kmod  Mh(K) = K \mod M

where:

Example: If we insert key 25 into a hash table of size 10 using h(K) = K mod 10, we get:

h(25)=25mod  10=5h(25) = 25 \mod 10 = 5

So, 25 is stored in index 5.

 What is Collision in Hashing?

A collision occurs when two keys produce the same hash value.
For example, if 25 and 35 both hash to index 5, we have a collision.

Why Collisions Occur?

  1. Limited number of slots in the hash table.
  2. Different keys hashing to the same index.
  3. Poor hash function choice.

 Collision Handling Techniques

 Open Addressing (Probing Techniques)

The key is placed in an alternative location if a collision occurs.

(a) Linear Probing

 If a collision occurs, check the next available slot (i.e., h(K)+ih(K) + i).
Example: If 25 and 35 both hash to index 5, store 35 in 6.
Issue: Clustering (many keys get stored together).

Formula:

h(K,i)=(h(K)+i)mod  Mh(K, i) = (h(K) + i) \mod M

(b) Quadratic Probing

 Checks the next available slot in a quadratic sequence ().
Example: If 25 hashes to 5, the next checked slots are (5+1², 5+2², 5+3², …).
Reduces clustering but may lead to secondary clustering.

Formula:

h(K,i)=(h(K)+i2)mod  Mh(K, i) = (h(K) + i^2) \mod M

(c) Double Hashing

 Uses two hash functions to reduce clustering.
Example: If h1(K) = 5, use a second function (h2(K)) to determine the next slot.

Formula:

h(K,i)=(h1(K)+i×h2(K))mod  Mh(K, i) = (h1(K) + i \times h2(K)) \mod M

 Separate Chaining (Linked List in Buckets)

 Each slot in the hash table stores a linked list of values.
 If multiple keys hash to the same index, they are stored in a linked list at that index.
Example: If 25 and 35 both hash to 5, they are stored in a linked list at index 5.

Advantage: Efficient for large hash tables.
Disadvantage: Increased memory usage due to linked lists.

 Choosing the Best Collision Resolution Technique

Method Pros Cons
Linear Probing Simple, easy to implement Causes clustering
Quadratic Probing Reduces clustering May not find empty slots
Double Hashing Avoids clustering Slightly complex
Separate Chaining Handles collisions well Requires extra memory

 Example GATE Question (Hashing Collision Handling)

Question: A hash function is defined as h(K) = K mod 10. Given keys 23, 43, 13, 27, 33, insert them using linear probing in a hash table of size 10.

Solution:
Step 1: Compute hash values.

Key Hash Value (h(K) = K mod 10) Placement (Using Linear Probing)
23 23 mod 10 = 3 Slot 3
43 43 mod 10 = 3 Collision → Next slot 4
13 13 mod 10 = 3 Collision → Next slot 5
27 27 mod 10 = 7 Slot 7
33 33 mod 10 = 3 Collision → Next slot 6

Final Hash Table:

Index: 0 1 2 3 4 5 6 7 8 9
Keys: - - - 23 43 13 33 27 - -

 Conclusion

Hashing is essential for fast data access in databases.
Collisions occur when multiple keys hash to the same index.
Various techniques like open addressing & separate chaining help resolve collisions.
Choosing the right method depends on efficiency, memory, and clustering avoidance.

Would you like more examples or Python code implementations?

Here’s a focused explanation on Collisions in Hashing, tailored for GATE CSE/IT – Databases:


Collision in Hashing

What is a Collision?

A collision occurs when two different keys hash to the same index in the hash table.

Let h(k1) = h(k2) but k1 ≠ k2 → this is a collision.

Hash collisions are inevitable due to the pigeonhole principle when:


Why Collisions Happen


Collision Resolution Techniques

To handle collisions, we use collision resolution strategies. There are two main categories:


1. Open Hashing (Separate Chaining)

Example:
If h(13) = h(27) = 6, both keys are stored in a linked list at index 6.

Pros:

Cons:


2. Closed Hashing (Open Addressing)

All elements are stored directly in the hash table. If a collision occurs, the algorithm searches for another open slot.

a) Linear Probing

h(k, i) = (h(k) + i) mod m

b) Quadratic Probing

h(k, i) = (h(k) + c1*i + c2*i²) mod m

c) Double Hashing

h(k, i) = (h1(k) + i * h2(k)) mod m

Pros:

Cons:


Load Factor (α)

α = n / m

For open addressing, performance drops sharply as α → 1.


Example (GATE Level)

Keys: 10, 22, 31, 4, 15, 28, 17, 88, 59
Hash function: h(k) = k mod 9

Using Linear Probing, fill a table of size 9:

lua
h(10) = 1
h(22) = 4
h(31) = 4 → collision → next slot → 5
h(4) = 4 → collision → next free slot → 6
...

This is a typical GATE-style question.


GATE Tips


Would you like a PDF summary, MCQ practice, or a visual diagram of collision resolution?

GATE CSEIT/Database/Collision in Hashing Hashing collision.

Introduction to Programming and Data Structures – Hashing

GATEONLINE CLASSES ON DATA STRUCTURES