AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.

AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.

play-rounded-fill play-rounded-outline play-sharp-fill play-sharp-outline
pause-sharp-outline pause-sharp-fill pause-rounded-outline pause-rounded-fill
00:00

Optimal Binary Search Tree (OBST) – Advanced Analysis & Practical Example

1. Introduction to Optimal Binary Search Tree (OBST)

An Optimal Binary Search Tree (OBST) is a specialized Binary Search Tree (BST) that is structured in a way that minimizes the expected search cost (or total weighted path length).



When searching for elements in a BST, the cost depends on how frequently each element is accessed. Instead of constructing a standard BST, OBST minimizes the average search time based on given probabilities of searching for keys.

2. Why Do We Need OBST?

A regular BST does not always guarantee an optimal search time. If the tree is unbalanced, the search cost may be high.

  • OBST ensures that frequently accessed elements are placed closer to the root, reducing search cost on average.
  • This is useful in database indexing, compiler design (syntax trees), and information retrieval systems.

3. Problem Definition

Given n keys K1,K2,…,KnK_1, K_2, …, K_n in sorted order, and their respective probabilities of access:

  • Search probabilities: p1,p2,…,pnp_1, p_2, …, p_n (probability of searching each key)
  • Dummy keys’ probabilities: q0,q1,…,qnq_0, q_1, …, q_n (probability of searching between keys or missing elements)

The goal is to construct a binary search tree that minimizes the expected search cost:

C(T)=∑i=1n(Depth(Ki)×Probability(Ki))C(T) = \sum_{i=1}^{n} (Depth(K_i) \times Probability(K_i))

where Depth(K_i) is the level of node KiK_i in the tree.

4. Dynamic Programming Approach for OBST

We define:

  • e[i][j]e[i][j] → Expected search cost for subtree containing keys KiK_i to KjK_j
  • w[i][j]w[i][j] → Sum of all probabilities from KiK_i to KjK_j
  • root[i][j]root[i][j] → Root of the OBST for KiK_i to KjK_j

Recurrence Relation:

e[i][j]=min⁡r∈[i,j](e[i][r−1]+e[r+1][j]+w[i][j])e[i][j] = \min_{r \in [i,j]} (e[i][r-1] + e[r+1][j] + w[i][j])

where rr is the root node considered at that step.

5. Practical Example

Given Data:

Keys: K1, K2, K3
Sorted Keys: (10, 20, 30)

Key K1 (10) K2 (20) K3 (30)
Probability pp 0.3 0.4 0.3
Dummy qq 0.1 0.1 0.1

Step 1: Compute Weight Matrix w[i][j]w[i][j]

w[i][j]=pi+pi+1+…+pj+qi−1+qjw[i][j] = p_i + p_{i+1} + … + p_j + q_{i-1} + q_j

i → j w[i][i] w[i][i+1] w[i][i+2]
w[1][1]w[1][1] = 0.4 w[1][2]w[1][2] = 0.9 w[1][3]w[1][3] = 1.3
w[2][2]w[2][2] = 0.5 w[2][3]w[2][3] = 0.9
w[3][3]w[3][3] = 0.4

Step 2: Compute Expected Cost e[i][j]e[i][j]

Base Case:

e[i][i]=w[i][i]e[i][i] = w[i][i]

For multiple elements, choose rr (root) that minimizes:

e[i][j]=e[i][r−1]+e[r+1][j]+w[i][j]e[i][j] = e[i][r-1] + e[r+1][j] + w[i][j]

i → j e[i][i] e[i][i+1] e[i][i+2]
e[1][1]e[1][1] = 0.4 e[1][2]e[1][2] = 0.9 (root = K1) e[1][3]e[1][3] = 1.3 (root = K2)
e[2][2]e[2][2] = 0.5 e[2][3]e[2][3] = 0.9 (root = K2)
e[3][3]e[3][3] = 0.4

6. Constructing the Optimal BST

From the table:

  • K2 (20) becomes the root (it minimizes expected cost).
  • K1 (10) is the left child.
  • K3 (30) is the right child.

Final Optimal BST Structure:

markdown
20
/ \
10 30

7. Applications of OBST

✔ Database Indexing (e.g., indexing frequently accessed records efficiently).
✔ Compiler Design (syntax trees in parsers).
✔ Artificial Intelligence (decision trees in machine learning).

8. Conclusion

The Optimal Binary Search Tree (OBST) ensures efficient searching by minimizing the expected search cost using dynamic programming. This is especially useful in applications where search frequencies vary significantly.

Would you like a code implementation of OBST in C++/Python?

AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.

Optimal Decision Trees via Dynamic Programming and …

Optimal binary search trees

Optimal Binary Search Tree

✅ AAD – Optimal Binary Search Tree (OBST)

Part 25: With Practical Example for Deeper Understanding
(AAD = Advanced Algorithm Design – relevant to GATE CSE/IT)


📘 What is an Optimal Binary Search Tree?

An Optimal Binary Search Tree (OBST) is a binary search tree built from a set of sorted keys, such that:

  • Search cost is minimized based on the frequency/probability of accessing each key.
  • Frequently accessed keys are placed closer to the root.
  • Used when search frequencies are known in advance (e.g., in compilers, symbol tables, etc.).

🔹 Why OBST?

In a normal BST, the structure depends only on the order of insertion.
But in OBST, we want to minimize the weighted search time, i.e.:

Expected Cost=∑i=1nfi⋅depth(ki)\text{Expected Cost} = \sum_{i=1}^{n} f_i \cdot \text{depth}(k_i)

Where:

  • fif_i = frequency of accessing key kik_i

🧠 Problem Statement

Given:

  • A sorted list of keys K={k1,k2,…,kn}K = \{k_1, k_2, …, k_n\}
  • Their corresponding search probabilities P={p1,p2,…,pn}P = \{p_1, p_2, …, p_n\}
  • And dummy keys (for unsuccessful searches) with probabilities Q={q0,q1,…,qn}Q = \{q_0, q_1, …, q_n\}

Construct an OBST that minimizes the expected cost of searching.


🔧 Dynamic Programming Solution

We build two tables:

  • Cost[i][j]: Minimum cost of a BST that can be formed from kik_i to kjk_j
  • Root[i][j]: Root of the subtree containing kik_i to kjk_j

✅ Practical Example

Let’s say we have 3 keys:

Key Probability (p)
A (k₁) 0.15
B (k₂) 0.10
C (k₃) 0.05

Dummy keys: q0=0.05,q1=0.10,q2=0.05,q3=0.05q_0 = 0.05, q_1 = 0.10, q_2 = 0.05, q_3 = 0.05

Step 1: Initialize Tables

We use 0-based indexing for cost and root tables:

  • Cost[i][j] for i > j = qiq_i (base case)
  • Start filling lengths from 1 to n (bottom-up DP)

Step 2: Fill Cost and Root Tables

We use the recurrence:

Cost[i][j]=min⁡r=ij{Cost[i][r−1]+Cost[r+1][j]+W(i,j)}Cost[i][j] = \min_{r=i}^{j} \{ Cost[i][r-1] + Cost[r+1][j] + W(i,j) \}

Where:

W(i,j)=∑l=ijpl+∑l=i−1jqlW(i, j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l

This is the total weight (probability mass) from i to j.


Step 3: Final Result

After filling the DP tables, the value Cost[1][n] gives the minimum search cost, and you can reconstruct the tree using the Root[i][j] table.


🌳 OBST Tree Construction (Visual Representation)

For this example, the optimal root may be key B, with A as left child and C as right child — this gives the lowest weighted cost.

        B
       / \
      A   C

🔍 Real-world Use Case

  • Compilers/Symbol Tables – frequently accessed variables or function names are accessed quickly.
  • Database Indexing – when access pattern is known, OBST gives faster search than regular BST.
  • Predictive text/input systems – keys with higher selection probability are placed closer to root.

💡 Key Takeaways

Concept Summary
OBST BST optimized for search cost
Goal Minimize expected search time
Approach Dynamic Programming
Time Complexity O(n3)O(n^3), can be optimized to O(n2)O(n^2)
Application Databases, Compilers, AI trees

Would you like a step-by-step Python implementation of OBST with this example or want to try a GATE-style question?

AAD- Optimal Binary Search Tree With Practical Example ( for Deeper Understanding) Part 25.

Optimal Binary Search Trees



Leave a Reply

Your email address will not be published. Required fields are marked *

error: