AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.
AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.
Contents [hide]
- 0.1 Optimal Binary Search Tree (OBST) – Advanced Analysis & Practical Example
- 0.2 2. Why Do We Need OBST?
- 0.3 3. Problem Definition
- 0.4 4. Dynamic Programming Approach for OBST
- 0.5 5. Practical Example
- 0.6 6. Constructing the Optimal BST
- 0.7 7. Applications of OBST
- 0.8 8. Conclusion
- 0.9 AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.
- 0.10 Optimal Decision Trees via Dynamic Programming and …
- 0.11 Optimal binary search trees
- 0.12 Optimal Binary Search Tree
- 0.13
AAD – Optimal Binary Search Tree (OBST)
- 1
What is an Optimal Binary Search Tree?
- 2
Why OBST?
- 3
Problem Statement
- 4
Dynamic Programming Solution
- 5
Practical Example
- 6
OBST Tree Construction (Visual Representation)
- 7
Real-world Use Case
- 8
Key Takeaways
Optimal Binary Search Tree (OBST) – Advanced Analysis & Practical Example
1. Introduction to Optimal Binary Search Tree (OBST)
An Optimal Binary Search Tree (OBST) is a specialized Binary Search Tree (BST) that is structured in a way that minimizes the expected search cost (or total weighted path length).
When searching for elements in a BST, the cost depends on how frequently each element is accessed. Instead of constructing a standard BST, OBST minimizes the average search time based on given probabilities of searching for keys.
2. Why Do We Need OBST?
A regular BST does not always guarantee an optimal search time. If the tree is unbalanced, the search cost may be high.
- OBST ensures that frequently accessed elements are placed closer to the root, reducing search cost on average.
- This is useful in database indexing, compiler design (syntax trees), and information retrieval systems.
3. Problem Definition
Given n keys K1,K2,…,KnK_1, K_2, …, K_n in sorted order, and their respective probabilities of access:
- Search probabilities: p1,p2,…,pnp_1, p_2, …, p_n (probability of searching each key)
- Dummy keys’ probabilities: q0,q1,…,qnq_0, q_1, …, q_n (probability of searching between keys or missing elements)
The goal is to construct a binary search tree that minimizes the expected search cost:
C(T)=∑i=1n(Depth(Ki)×Probability(Ki))C(T) = \sum_{i=1}^{n} (Depth(K_i) \times Probability(K_i))
where Depth(K_i) is the level of node KiK_i in the tree.
4. Dynamic Programming Approach for OBST
We define:
- e[i][j]e[i][j] → Expected search cost for subtree containing keys KiK_i to KjK_j
- w[i][j]w[i][j] → Sum of all probabilities from KiK_i to KjK_j
- root[i][j]root[i][j] → Root of the OBST for KiK_i to KjK_j
Recurrence Relation:
e[i][j]=minr∈[i,j](e[i][r−1]+e[r+1][j]+w[i][j])e[i][j] = \min_{r \in [i,j]} (e[i][r-1] + e[r+1][j] + w[i][j])
where rr is the root node considered at that step.
5. Practical Example
Given Data:
Keys: K1, K2, K3
Sorted Keys: (10, 20, 30)
Key | K1 (10) | K2 (20) | K3 (30) |
---|---|---|---|
Probability pp | 0.3 | 0.4 | 0.3 |
Dummy qq | 0.1 | 0.1 | 0.1 |
Step 1: Compute Weight Matrix w[i][j]w[i][j]
w[i][j]=pi+pi+1+…+pj+qi−1+qjw[i][j] = p_i + p_{i+1} + … + p_j + q_{i-1} + q_j
i → j | w[i][i] | w[i][i+1] | w[i][i+2] |
---|---|---|---|
w[1][1]w[1][1] = 0.4 | w[1][2]w[1][2] = 0.9 | w[1][3]w[1][3] = 1.3 | |
w[2][2]w[2][2] = 0.5 | w[2][3]w[2][3] = 0.9 | ||
w[3][3]w[3][3] = 0.4 |
Step 2: Compute Expected Cost e[i][j]e[i][j]
Base Case:
e[i][i]=w[i][i]e[i][i] = w[i][i]
For multiple elements, choose rr (root) that minimizes:
e[i][j]=e[i][r−1]+e[r+1][j]+w[i][j]e[i][j] = e[i][r-1] + e[r+1][j] + w[i][j]
i → j | e[i][i] | e[i][i+1] | e[i][i+2] |
---|---|---|---|
e[1][1]e[1][1] = 0.4 | e[1][2]e[1][2] = 0.9 (root = K1) | e[1][3]e[1][3] = 1.3 (root = K2) | |
e[2][2]e[2][2] = 0.5 | e[2][3]e[2][3] = 0.9 (root = K2) | ||
e[3][3]e[3][3] = 0.4 |
6. Constructing the Optimal BST
From the table:
- K2 (20) becomes the root (it minimizes expected cost).
- K1 (10) is the left child.
- K3 (30) is the right child.
Final Optimal BST Structure:
7. Applications of OBST
Database Indexing (e.g., indexing frequently accessed records efficiently).
Compiler Design (syntax trees in parsers).
Artificial Intelligence (decision trees in machine learning).
8. Conclusion
The Optimal Binary Search Tree (OBST) ensures efficient searching by minimizing the expected search cost using dynamic programming. This is especially useful in applications where search frequencies vary significantly.
Would you like a code implementation of OBST in C++/Python?
AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.
Optimal Decision Trees via Dynamic Programming and …
Optimal binary search trees
Optimal Binary Search Tree
AAD – Optimal Binary Search Tree (OBST)
Part 25: With Practical Example for Deeper Understanding
(AAD = Advanced Algorithm Design – relevant to GATE CSE/IT)
What is an Optimal Binary Search Tree?
An Optimal Binary Search Tree (OBST) is a binary search tree built from a set of sorted keys, such that:
- Search cost is minimized based on the frequency/probability of accessing each key.
- Frequently accessed keys are placed closer to the root.
- Used when search frequencies are known in advance (e.g., in compilers, symbol tables, etc.).
Why OBST?
In a normal BST, the structure depends only on the order of insertion.
But in OBST, we want to minimize the weighted search time, i.e.:
Expected Cost=∑i=1nfi⋅depth(ki)\text{Expected Cost} = \sum_{i=1}^{n} f_i \cdot \text{depth}(k_i)
Where:
- fif_i = frequency of accessing key kik_i
Problem Statement
Given:
- A sorted list of keys K={k1,k2,…,kn}K = \{k_1, k_2, …, k_n\}
- Their corresponding search probabilities P={p1,p2,…,pn}P = \{p_1, p_2, …, p_n\}
- And dummy keys (for unsuccessful searches) with probabilities Q={q0,q1,…,qn}Q = \{q_0, q_1, …, q_n\}
Construct an OBST that minimizes the expected cost of searching.
Dynamic Programming Solution
We build two tables:
- Cost[i][j]: Minimum cost of a BST that can be formed from kik_i to kjk_j
- Root[i][j]: Root of the subtree containing kik_i to kjk_j
Practical Example
Let’s say we have 3 keys:
Key | Probability (p) |
---|---|
A (k₁) | 0.15 |
B (k₂) | 0.10 |
C (k₃) | 0.05 |
Dummy keys: q0=0.05,q1=0.10,q2=0.05,q3=0.05q_0 = 0.05, q_1 = 0.10, q_2 = 0.05, q_3 = 0.05
Step 1: Initialize Tables
We use 0-based indexing for cost and root tables:
- Cost[i][j] for i > j = qiq_i (base case)
- Start filling lengths from 1 to n (bottom-up DP)
Step 2: Fill Cost and Root Tables
We use the recurrence:
Cost[i][j]=minr=ij{Cost[i][r−1]+Cost[r+1][j]+W(i,j)}Cost[i][j] = \min_{r=i}^{j} \{ Cost[i][r-1] + Cost[r+1][j] + W(i,j) \}
Where:
W(i,j)=∑l=ijpl+∑l=i−1jqlW(i, j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l
This is the total weight (probability mass) from i to j.
Step 3: Final Result
After filling the DP tables, the value Cost[1][n] gives the minimum search cost, and you can reconstruct the tree using the Root[i][j] table.
OBST Tree Construction (Visual Representation)
For this example, the optimal root may be key B, with A as left child and C as right child — this gives the lowest weighted cost.
B
/ \
A C
Real-world Use Case
- Compilers/Symbol Tables – frequently accessed variables or function names are accessed quickly.
- Database Indexing – when access pattern is known, OBST gives faster search than regular BST.
- Predictive text/input systems – keys with higher selection probability are placed closer to root.
Key Takeaways
Concept | Summary |
---|---|
OBST | BST optimized for search cost |
Goal | Minimize expected search time |
Approach | Dynamic Programming |
Time Complexity | O(n3)O(n^3), can be optimized to O(n2)O(n^2) |
Application | Databases, Compilers, AI trees |
Would you like a step-by-step Python implementation of OBST with this example or want to try a GATE-style question?