Binary Trees & BST
The hierarchical structure behind every file system, every database index, and every decision tree. When a production B-tree degenerated to a linked list, a 1ms database query became 45 seconds.
Why this matters at your level
Write inorder, preorder, postorder traversals recursively. Know the BST property (left < node < right). Implement level-order traversal using a queue.
Validate a BST correctly using propagated bounds. Implement lowest common ancestor. Know why sequential inserts degrade BST performance and how self-balancing trees prevent it.
Connect BST to database indexes: understand why B-trees use wide nodes, why UUID v1 degrades database indexes, and when to use CLUSTER, REINDEX, or UUID v4 in production.
Binary Trees & BST
The hierarchical structure behind every file system, every database index, and every decision tree. When a production B-tree degenerated to a linked list, a 1ms database query became 45 seconds.
App launches with UUID v1 primary keys — sequential UUIDs always insert to rightmost B-tree leaf
Index becomes right-skewed. Query planner begins occasionally choosing sequential scans over the index.
WARNING10M rows. Query planner consistently chooses sequential scan. User-facing query time: 45 seconds.
CRITICALEngineer runs EXPLAIN ANALYZE — discovers sequential scan on 10M rows instead of index scan.
WARNINGFix: switch to UUID v4 (random) for new keys + REINDEX to rebuild balanced tree. Query time: back to 1ms.
The question this raises
If a binary search tree that is not kept balanced degrades to O(n) for everything — why does balance matter and what keeps real database indexes balanced?
A balanced binary search tree holds 1 million items. In the best case, how many comparisons do you need to find a specific value?
What you'll learn
- A binary tree: each node has at most two children (left and right). A BST adds a property: all left descendants < node < all right descendants.
- BST search, insert, delete are O(h) where h is tree height. For a balanced tree: h = O(log n). For a degenerate tree: h = O(n).
- Inorder traversal of a BST produces elements in sorted order — this is the key property used in database range queries.
- Self-balancing trees (AVL, Red-Black) maintain O(log n) height automatically by rotating nodes after insert/delete.
- Database B-trees are m-ary (not binary) — wider nodes mean fewer levels and fewer disk reads per lookup.
- Floyd's cycle detection (two pointers) applies to trees via iterative traversal — convert recursion to iteration to avoid stack overflow on deep trees.
Lesson outline
What this looks like in real life
The library that stops working
Imagine a library where every new book is placed to the RIGHT of all existing books (because its ISBN is higher — like a sequential UUID). After a year, all 100,000 books are in a single rightward chain. To find any book, you walk the entire chain. The "binary search tree" is now a linked list. That is what happened to the startup's PostgreSQL index: 8 months of sequential UUID inserts created a right-skewed tree the query planner refused to use.
Binary Search Tree
BST property: left.value < node.value < right.value height h determines all complexities: O(h)
Use for: Reach for this when you need O(log n) insert, delete, and search with maintained sorted order — database indexes, sorted sets, symbol tables
The insight that makes it work
The key insight: the BST property turns search into elimination. At every node, you can eliminate one entire subtree based on a single comparison. If you are looking for 42 and the current node is 60, you eliminate everything in the right subtree — every value there is larger than 60, which is already larger than 42. You only go left. At the next node, you eliminate again. This is O(log n) ONLY if the tree is balanced.
How this concept changes your thinking
Searching for value 42 in a BST of 1 million nodes
“I would traverse every node until I find 42. That could be 1 million comparisons.”
“The BST property lets me eliminate half the tree at each step. In a balanced tree, I make at most log₂(1,000,000) ≈ 20 comparisons. Balance is the precondition.”
Inserting sequential values 1, 2, 3, 4, 5 into a BST
“I insert them in order. Each new value goes to the right of the previous. The tree balances itself, right?”
“No — a basic BST does not self-balance. Sequential inserts create a right-skewed tree that looks exactly like a linked list. Every search now takes O(n). Self-balancing trees (AVL, Red-Black) rotate nodes to maintain balance after each insert.”
Writing a recursive tree traversal function
“I write it recursively — trees are naturally recursive, so this is the clearest way.”
“For very deep trees (10,000+ levels), recursion will stack overflow. The iterative version uses an explicit stack. For interview problems, recursive is fine. For production systems with user-supplied data, always consider depth limits.”
Walk through it in plain English
Inorder traversal of a BST with values [4, 2, 6, 1, 3, 5, 7]
01
The BST looks like this: root=4, left subtree has 2 (with children 1 and 3), right subtree has 6 (with children 5 and 7). The BST property means left < parent < right at every level.
02
Inorder traversal visits: left subtree, then current node, then right subtree. Applied recursively to every node, this produces all values in sorted order.
03
Visit left subtree of 4: go to node 2. Its left subtree is node 1 (no children). Visit 1. Then visit 2. Then visit 3 (right child of 2). Left subtree of root done: output [1, 2, 3].
04
Visit root (4): output 4. Running output: [1, 2, 3, 4].
05
Visit right subtree of 4: go to node 6. Its left subtree is 5 (no children). Visit 5. Then visit 6. Then visit 7. Running output: [1, 2, 3, 4, 5, 6, 7].
06
The inorder traversal produced [1, 2, 3, 4, 5, 6, 7] — the values in sorted order. This always works for a BST. It is why database indexes can perform range queries: walk the inorder sequence between the lower and upper bounds.
The BST looks like this: root=4, left subtree has 2 (with children 1 and 3), right subtree has 6 (with children 5 and 7). The BST property means left < parent < right at every level.
Inorder traversal visits: left subtree, then current node, then right subtree. Applied recursively to every node, this produces all values in sorted order.
Visit left subtree of 4: go to node 2. Its left subtree is node 1 (no children). Visit 1. Then visit 2. Then visit 3 (right child of 2). Left subtree of root done: output [1, 2, 3].
Visit root (4): output 4. Running output: [1, 2, 3, 4].
Visit right subtree of 4: go to node 6. Its left subtree is 5 (no children). Visit 5. Then visit 6. Then visit 7. Running output: [1, 2, 3, 4, 5, 6, 7].
The inorder traversal produced [1, 2, 3, 4, 5, 6, 7] — the values in sorted order. This always works for a BST. It is why database indexes can perform range queries: walk the inorder sequence between the lower and upper bounds.
Now in code
Inorder traversal and BST search — the two most fundamental tree operations:
1// Inorder traversal: produces sorted output from BST2function inorder(node, result = []) {3if (node === null) return result; // base caseBase case: null node means we have gone past a leaf — return without adding anything.4inorder(node.left, result); // left subtree firstLeft subtree FIRST — this ensures smaller values appear before the current node in the output.5result.push(node.val); // current nodeCurrent node is added BETWEEN left and right — that is what "inorder" means.6inorder(node.right, result); // right subtree last7return result;8}910// BST search: O(h) — h is tree height11function searchBST(root, target) {12if (root === null) return null; // not found13if (root.val === target) return root; // foundBST property: if target is smaller than current node, it can only be in the LEFT subtree — we eliminate the entire right subtree in one comparison.14if (target < root.val) {15return searchBST(root.left, target); // go leftIf target is larger, it can only be in the RIGHT subtree. Each comparison halves the remaining candidates — O(log n) for balanced trees.16}17return searchBST(root.right, target); // go right18}
The trap — what most people try first
Validating a BST looks simple but has a common off-by-one error: checking only the parent-child relationship is not enough.
Wrong BST validation (checks only parent-child) vs correct (propagates bounds)
// WRONG — only checks direct parent-child
// Misses: a node's value must be less than
// ALL ancestors, not just its parent
function isValidBST_wrong(node) {
if (!node) return true;
const leftOk = !node.left ||
node.left.val < node.val;
const rightOk = !node.right ||
node.right.val > node.val;
return leftOk && rightOk &&
isValidBST_wrong(node.left) &&
isValidBST_wrong(node.right);
}// CORRECT — propagates min/max bounds
function isValidBST(node, min=-Infinity, max=Infinity) {
if (!node) return true;
if (node.val <= min || node.val >= max) {
return false;
}
return isValidBST(node.left, min, node.val) &&
isValidBST(node.right, node.val, max);
}The wrong version checks: is my left child smaller than me? Is my right child larger? This catches obvious violations but misses subtler ones. A node in the left subtree must be less than ALL of its ancestors — not just its direct parent. The correct version passes min and max bounds down the tree. Going left narrows the maximum (the child must be less than the parent). Going right narrows the minimum (the child must be greater than the parent). Every node is checked against the full range of valid values it could take.
When to reach for this
Which tree operation do I need?
How fast is it?
| Operation | Balanced BST | Degenerate BST (sorted inserts) |
|---|---|---|
| Search for a value | Very fast (O(log n)) | Slow (O(n)) — like a linked list |
| Insert a value | Very fast (O(log n)) | Slow (O(n)) — find insertion point first |
| Delete a value | Very fast (O(log n)) | Slow (O(n)) |
| Inorder traversal (all nodes) | Fast (O(n)) — visits every node once | Fast (O(n)) — same, just a long line |
| Find minimum or maximum | Fast (O(log n)) — go all left or all right | Slow (O(n)) — for degenerate tree |
Exam Answer vs. Production Reality
BST vs sorted array
📖 What the exam expects
Both BST and sorted array achieve O(log n) search. BST additionally achieves O(log n) insert and delete (for balanced trees). Sorted array requires O(n) for insert/delete due to shifting elements.
Toggle between what certifications teach and what production actually requires
How this might come up in interviews
Tree problems dominate FAANG interviews. Core operations: inorder traversal, level-order BFS, lowest common ancestor, path sum, maximum depth. The harder problems test whether you recognise a tree problem in disguise: validate BST, serialize/deserialize a tree, construct a tree from two traversals.
Common questions:
- Inorder traversal (recursive and iterative)
- Maximum depth of a binary tree
- Validate a BST (with bounds propagation)
- Lowest common ancestor in a BST and a general binary tree
- Level-order traversal (BFS) returning each level as a separate list
Strong answer: states the traversal type (inorder/preorder/postorder/level-order) and WHY before writing a single line / handles null node base case as the first thing in the function / knows BST property: ALL left descendants < current < ALL right descendants / connects BST to database indexes when asked about real-world applications
Red flags: confuses binary tree (any tree with ≤2 children) with BST (left < node < right) / validates BST only by checking parent-child, not propagating bounds / does not handle the null node base case immediately / cannot name the traversal order and explain WHY that order is needed
💡 Analogy
A BST is like a well-organized library where each book points to the section of books that come before it (left) and the section that come after it (right). To find any book, start at the front desk (root), ask "does my book come before or after this one?", and go left or right. If the library is balanced, you check at most log₂(total books) sections. If all books were shelved in a single row on the right — a degenerate BST — you check every section one by one. That is the difference between O(log n) and O(n) in a production database.
⚡ Core Idea
The BST property (left < node < right at every node) enables binary search through the tree: each comparison eliminates half the remaining candidates. This only works if the tree is balanced — if one subtree is much deeper than the other, you eliminate less than half each step and the guarantee breaks down. Self-balancing trees (AVL, Red-Black, B-tree) maintain balance automatically after each insert or delete.
🎯 Why It Matters
Database indexes are B-trees. File systems use B-trees. If your data is inserted in sorted order (monotonically increasing UUIDs, timestamps, auto-increment IDs), the tree degenerates toward O(n). Knowing this tells you: use random UUIDs or let the database handle ID generation with built-in balancing.
Ready to see how this works in the cloud?
Switch to Career Paths for structured paths (e.g. Developer, DevOps) and provider-specific lessons.
View role-based pathsSign in to track your progress and mark lessons complete.
Discussion
Questions? Discuss in the community or start a thread below.
Join DiscordIn-app Q&A
Sign in to start or join a thread.