Interactive Explainer

Relevant for:Mid-levelSenior

Why this matters at your level

Mid-level

Use heapq for top-K and priority queue problems. Implement a basic Trie with insert, search, and startsWith. Know when each data structure applies and why it is faster than the naive alternative.

Senior

Design autocomplete systems with precomputed top-K at each trie node. Know how tries are used in network routing (longest prefix match). Understand heap-based merge of K sorted arrays and its use in external sorting and distributed merge.

Heaps & Tries

Two specialised data structures that solve specific problems orders of magnitude faster than general alternatives. Heaps keep the minimum or maximum always accessible in O(1). Tries navigate prefix-based searches in O(key length). When a startup built autocomplete using a trie with O(n) traversal per keystroke, typing felt like moving through quicksand — and the feature was removed after three days.

~5 min read

Be the first to complete!

LIVEFeature Removed — Autocomplete Trie — O(n) Traversal per Keystroke

Breaking News

Day 1

Autocomplete launched — trie built correctly but traversal is O(n) per keystroke for popular prefixes

Day 1, first users

800ms lag between keystrokes — search bar visibly trails the user's typing

CRITICAL

Day 2

Analytics show users abandoning search after 1-2 keystrokes — engagement drops to near zero

CRITICAL

Day 3

Feature removed from production — users routed to browse navigation instead

WARNING

Post-mortem

Fix identified: store top-K results at each trie node during indexing. Lookup becomes O(prefix length), not O(subtree size)

—per keystroke in autocomplete — from O(n) full subtree traversal instead of O(prefix length)

—before the feature was removed — the right data structure used in the wrong way

The question this raises

Using the right data structure is not enough — you also need to use it in a way that takes advantage of its structure. What is the correct way to use a trie for autocomplete?

Test your assumption first

You are building a search autocomplete. As the user types each letter, you want to show the top 10 suggestions. You built the trie correctly. Which approach gives the best performance per keystroke?

What you'll learn

A heap is a binary tree where the root is always the minimum (min-heap) or maximum (max-heap) — O(1) peek, O(log n) insert and remove
Use a heap for: top-K elements, median of a stream, Dijkstra shortest path (priority queue), task scheduling by priority
A trie (prefix tree) stores strings so that common prefixes share nodes — O(key length) insert and lookup, independent of how many keys are stored
Use a trie for: autocomplete, spell checking, IP routing tables, word search in a dictionary
The critical autocomplete optimisation: store precomputed top-K results at each trie node during indexing, so lookup is O(prefix length) not O(subtree size)
Python: use heapq module (min-heap). For max-heap, negate values. For top-K largest, maintain a min-heap of size K.

Lesson outline

What this looks like in real life

Heap: the always-ready top-10

Imagine tracking the 10 most expensive orders in a stream of millions. With a sorted list: each new order triggers a re-sort O(n log n). With a min-heap of size 10: compare the new order to the smallest of the current top-10 (the heap root). If the new order is larger, swap it in and re-heap in O(log 10) = O(1). The minimum of your top-10 is always at the root, ready to be compared and replaced. Processing a million orders: O(n log K) where K=10.

Heap (Priority Queue)

import heapq
heap = []
heapq.heappush(heap, item)
heapq.heappop(heap)  # removes minimum

Use for: Reach for a heap when you need the minimum or maximum of a dynamic set, especially the top-K elements, or when you need to repeatedly extract the smallest or largest item

The insight that makes it work

The key insight for heaps: you do not need full sorting to get the minimum or maximum — just maintain the heap property. Full sorting is O(n log n). Maintaining a heap where the minimum is always at the root is O(log n) per operation. If you only ever need the min or max, paying for a full sort is wasted work.

How this concept changes your thinking

Situation

Before

After

I need to find the 10 most popular search terms from a stream of 10 million queries

“I collect all 10 million, sort by frequency (O(n log n)), and take the top 10”

“I use a min-heap of size 10. For each new query, if its frequency exceeds the minimum in my heap, I swap it in (O(log 10)). After 10 million queries: O(n log K) where K=10 — almost O(n).”

I need autocomplete suggestions for a search box with 500,000 indexed words

“On each keystroke I scan all words starting with the typed prefix — O(n) per character”

“I build a trie where each node stores the top-10 precomputed suggestions for its prefix. Lookup is O(prefix length) — typing "pro" traverses 3 nodes and returns the answer in 3 steps, not 500,000.”

I need to find a word in a dictionary of 100,000 entries

“Hash table: O(1) average for exact match only”

“Trie: O(word length) for exact match AND prefix search AND "does any word start with this prefix". The trie handles all three queries; the hash table handles only the first.”

Walk through it in plain English

Trie insert and lookup: words "cat", "car", "card", "care"

→

Start with an empty root node. The root has no character — it is the entry point.

→

Insert "cat". From root, follow (or create) the node for "c". From "c", follow "a". From "a", follow "t". Mark "t" as end-of-word. Nodes: root → c → a → t*.

→

Insert "car". From root, "c" exists. From "c", "a" exists. From "a", "r" is new — create it. Mark "r" as end-of-word. Now "a" has two children: "t" and "r".

→

Insert "card". From root, "c"→"a"→"r" all exist. "d" is new — create it. Mark "d" as end-of-word. "r" now has one child: "d".

→

Insert "care". From root, "c"→"a"→"r" exist. "e" is new — create it. Mark "e" as end-of-word. "r" now has two children: "d" and "e".

→

Lookup "car". From root, follow "c"→"a"→"r". "r" is marked end-of-word. Found. Total: 3 steps regardless of how many other words are in the trie.

Prefix search "car". Same traversal: "c"→"a"→"r". From "r", collect all end-of-word descendants: "car", "card", "care". The subtree rooted at "r" contains exactly the words with prefix "car".

Start with an empty root node. The root has no character — it is the entry point.

Insert "cat". From root, follow (or create) the node for "c". From "c", follow "a". From "a", follow "t". Mark "t" as end-of-word. Nodes: root → c → a → t*.

Insert "car". From root, "c" exists. From "c", "a" exists. From "a", "r" is new — create it. Mark "r" as end-of-word. Now "a" has two children: "t" and "r".

Insert "card". From root, "c"→"a"→"r" all exist. "d" is new — create it. Mark "d" as end-of-word. "r" now has one child: "d".

Insert "care". From root, "c"→"a"→"r" exist. "e" is new — create it. Mark "e" as end-of-word. "r" now has two children: "d" and "e".

Lookup "car". From root, follow "c"→"a"→"r". "r" is marked end-of-word. Found. Total: 3 steps regardless of how many other words are in the trie.

Prefix search "car". Same traversal: "c"→"a"→"r". From "r", collect all end-of-word descendants: "car", "card", "care". The subtree rooted at "r" contains exactly the words with prefix "car".

Now in code

Heap for top-K elements (Python heapq) and a basic Trie implementation. These are the two patterns that appear most in interviews.

heaps_tries.py

1import heapq
2 
3# HEAP: find top-K largest numbers
4def top_k_largest(nums: list, k: int) -> list:
5    min_heap = []                          # min-heap of size k
A min-heap of size K naturally tracks the K largest elements: any element larger than the heap minimum (the Kth largest seen so far) displaces it. The minimum of the heap is always the Kth largest.
6    for num in nums:
7        heapq.heappush(min_heap, num)      # push each number
8        if len(min_heap) > k:
Pop AFTER pushing, not before. Push first ensures the new element is considered. If it is too small, popping removes it immediately. The heap always holds exactly K elements.
9            heapq.heappop(min_heap)        # remove smallest if over k
10    return sorted(min_heap, reverse=True)  # result is top-k largest
11 
12# TRIE: insert and search
13class TrieNode:
14    def __init__(self):
Each TrieNode has a children dict (not a fixed-size array). This handles any alphabet without pre-allocating 26 slots. For a 26-character alphabet, an array is faster but uses more memory.
15        self.children = {}                 # char -> TrieNode
16        self.is_end = False                # marks end of a word
17 
18class Trie:
19    def __init__(self):
20        self.root = TrieNode()
21 
22    def insert(self, word: str):
23        node = self.root
24        for char in word:
is_end marks where a word terminates. Without it, "car" and "card" would be indistinguishable — traversing to "r" would find the same node whether you inserted "car" or only "card".
25            if char not in node.children:
26                node.children[char] = TrieNode()
27            node = node.children[char]
28        node.is_end = True                 # mark end of word
29 
30    def search(self, word: str) -> bool:
31        node = self.root
32        for char in word:
33            if char not in node.children:
34                return False
35            node = node.children[char]
36        return node.is_end

The trap — what most people try first

For top-K problems the naive approach sorts everything. For autocomplete the naive approach scans all words per prefix. Both are correct but use O(n log n) and O(n) respectively where heaps and tries give O(n log K) and O(prefix length).

Sorting for top-K instead of using a heap

Bug

# SLOW: sort all elements to find top-K
def top_k_naive(nums, k):
    nums.sort(reverse=True)  # O(n log n)
    return nums[:k]          # O(k)
# Total: O(n log n) for any K
# At n=10,000,000 and K=10: sorts 10M elements
# to find 10. A heap does it in O(n log 10) = O(n).

Fix

# FAST: maintain heap of size K
import heapq

def top_k_heap(nums, k):
    return heapq.nlargest(k, nums)  # O(n log k)
    # Or manually:
    # heap = []
    # for n in nums:
    #     heapq.heappush(heap, n)
    #     if len(heap) > k:
    #         heapq.heappop(heap)
    # return sorted(heap, reverse=True)

The naive approach sorts all n elements to find K of them. For K=10 and n=10,000,000, you sort 10 million elements to answer "what are the biggest 10?" The heap approach maintains only K elements at any time, processing each new element in O(log K). For K=10, log K = ~3.3, so each element costs about 3 operations. The speedup is proportional to log(n)/log(K) — enormous when K is small.

When to reach for this

Do you need repeated access to the minimum or maximum of a dynamic set?

YesUse a heap — O(1) peek, O(log n) insert and remove

NoContinue...

Do you need prefix-based search or autocomplete on strings?

YesUse a trie — O(key length) insert and lookup, prefix search built in

NoContinue...

Do you need exact-match string lookup only (no prefix operations needed)?

YesUse a hash table — O(1) average, simpler than a trie for exact match only

NoUse a sorted array or BST for range queries or ordered traversal

How fast is it?

Scenario	What happens	Speed
Find minimum in a heap	Always at root — no traversal needed	Instant (O(1))
Insert into a heap of n elements	Add at bottom, bubble up to correct position	Fast (O(log n))
Top-K from n elements using a heap	Maintain heap of size K, process each element once	Fast (O(n log K))
Insert or lookup in a trie (word length m)	Follow m nodes from root, create if missing	Fast (O(m))
Autocomplete with stored top-K at each trie node	Follow prefix length nodes, read precomputed result	Very fast (O(prefix length))
Autocomplete with full subtree scan per keystroke	Traverse entire subtree of current prefix node	Slow (O(subtree size))

Exam Answer vs. Production Reality

1 / 1

Heaps & Tries

📖 What the exam expects

A heap is a complete binary tree satisfying the heap property: every parent is smaller (min-heap) or larger (max-heap) than its children. O(1) min/max access, O(log n) insert and delete. A trie stores strings character by character; common prefixes share nodes. O(m) insert and lookup where m is the key length, independent of dictionary size.

Toggle between what certifications teach and what production actually requires

How this might come up in interviews

Heap problems: "find the Kth largest element", "merge K sorted lists", "find the median of a data stream", "task scheduler". Trie problems: "implement autocomplete", "word search in a dictionary", "replace words with roots", "design a search autocomplete system". Heaps and tries almost never appear in the same problem — identify which one applies and why.

Common questions:

Kth largest element in an array
Find median from a data stream (two heaps)
Merge K sorted lists
Implement trie (insert, search, startsWith)
Design search autocomplete system

Strong answer: immediately identifies top-K as a heap problem and justifies the O(n log K) complexity / knows to negate values for max-heap in Python / implements trie with children dict not a fixed 26-element array / mentions storing precomputed top-K in trie nodes for autocomplete without being asked

Red flags: uses sorting for top-K problems (O(n log n) when O(n log K) is achievable) / builds trie but uses full subtree traversal per prefix query / confuses min-heap and max-heap (Python heapq is min-heap by default) / does not know heapq in Python or PriorityQueue in Java

🧠Mental Model

💡 Analogy

A heap is like a tournament bracket where the winner (minimum or maximum) is always at the top and is immediately readable. Inserting a new player puts them at the bottom and they bubble up to their correct position. Removing the winner re-seeds the bracket in O(log n). A trie is like an autocomplete keyboard: each key you press narrows the tree to a smaller subtree. Typing "pr" takes you to a branch containing only words starting with "pr". Typing "pro" narrows further. The tree does the navigation for you based on each character.

⚡ Core Idea

A heap maintains the heap property: every parent is smaller (min-heap) or larger (max-heap) than its children. This guarantees O(1) access to the minimum or maximum at all times. A trie stores strings by their characters, one character per level. All strings sharing a prefix share the same nodes down to where they diverge. This makes prefix lookup O(key length) regardless of how many strings are stored.

🎯 Why It Matters

Heaps power every priority queue in computing: job schedulers, Dijkstra's algorithm, merge of sorted lists, finding the median in a stream of numbers. Tries power every search box you have ever used: Google's autocomplete, IDE code completion, router prefix matching, dictionary spell-check. These two structures appear in the background of almost every interactive system.

Ready to see how this works in the cloud?

Switch to Career Paths for structured paths (e.g. Developer, DevOps) and provider-specific lessons.

View role-based paths

Discussion

Questions? Discuss in the community or start a thread below.

Join Discord