Dynamic Programming
How to solve problems that have overlapping subproblems by storing results instead of recomputing them. The technique that turns O(2^n) disasters into O(n) solutions — and how a naive Fibonacci implementation once maxed out a Node.js server's CPU for 8 minutes on a single request.
Why this matters at your level
Add memoisation to recursive solutions. Implement tabulation for Fibonacci, coin change, and climbing stairs. Identify the recurrence relation before writing any code.
Recognise DP patterns: 1D table (Fibonacci, coin change), 2D table (LCS, edit distance, matrix chain). Know when to optimise space from O(n) to O(1) by keeping only the last row or two variables.
Design systems that use DP algorithms at scale: route optimisation, recommendation ranking, sequence alignment in bioinformatics pipelines. Understand which problems are NP-hard in general but tractable with DP: 0-1 knapsack is NP-hard yet solvable in pseudo-polynomial time with a DP table, while fractional knapsack needs no DP at all — a greedy algorithm solves it in polynomial time.
Client sends POST /calculate with n=50 — naive recursive fib(50) begins
Node.js event loop blocked — all other requests begin timing out as CPU hits 100%
CRITICAL: fib(50) finally returns after tens of billions of recursive calls — the process had been unresponsive for 8 minutes
CRITICAL: Memoisation added — the cache stores each fib(k) result, so fib(50) now needs about 50 unique computations instead of tens of billions of calls
fib(50) returns in under 1 millisecond with memoisation — same endpoint, roughly half a million times faster
The question this raises
If a function always returns the same output for the same input, why compute it more than once? What is the systematic way to cache results and avoid recomputation?
You need to compute the 50th Fibonacci number (fib(50) = 12,586,269,025). Your recursive solution works correctly for small numbers but the server hangs when n=50. What is the root cause?
What you'll learn
- Dynamic programming solves problems by breaking them into overlapping subproblems and storing each result so it is never recomputed
- Memoisation (top-down): add a cache to recursive solutions. On each call, check the cache first. If the answer is there, return it immediately without recursing.
- Tabulation (bottom-up): build a table starting from the smallest subproblems, filling up to the answer you need. No recursion, no stack overflow.
- DP is applicable when: (1) the problem has optimal substructure (optimal solution uses optimal solutions to subproblems) and (2) subproblems overlap (same subproblem is solved many times)
- Classic DP problems: Fibonacci, coin change, longest common subsequence, knapsack, edit distance, climbing stairs
- The key question before writing DP: "What is the subproblem?" and "How does the answer to a larger problem relate to smaller subproblems?" (the recurrence relation)
What this looks like in real life
The staircase problem
You are climbing a staircase. You can take 1 or 2 steps at a time. How many distinct ways can you reach step 10? The insight: to reach step 10, you must have come from step 9 (took 1 step) or step 8 (took 2 steps). So ways(10) = ways(9) + ways(8). Each of those is also split: ways(9) = ways(8) + ways(7). Without DP you recompute ways(8) twice. With DP you compute it once and look it up. The same pattern — subproblem answer reused — appears in every DP problem.
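The staircase insight translates directly into code. Here is a minimal bottom-up sketch (the function name `climb_ways` is ours, chosen for illustration):

```python
# ways(i) = ways(i-1) + ways(i-2): you arrive at step i from i-1 or i-2
def climb_ways(n: int) -> int:
    if n <= 1:
        return 1  # one way to be at step 0 or step 1
    prev2, prev1 = 1, 1  # ways(0), ways(1)
    for _ in range(2, n + 1):
        prev2, prev1 = prev1, prev2 + prev1  # ways(i) = ways(i-1) + ways(i-2)
    return prev1

print(climb_ways(10))  # 89 distinct ways to reach step 10
```

Because each answer depends only on the previous two, two variables replace the whole table — the O(1)-space trick mentioned later in this lesson.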
Dynamic Programming
cache = {}
def dp(n):
    if n in cache: return cache[n]
    cache[n] = ...  # compute from smaller subproblems
    return cache[n]

Use for: reach for DP when your recursive solution is slow because the same sub-call appears many times in the call tree. Add a cache dictionary to your recursion (memoisation) and measure the speedup.
The insight that makes it work
The key insight: if a function always returns the same output for the same input, computing it twice is wasted work. DP is the systematic realisation of this: identify every unique subproblem, compute it once, store it, reuse it. The call tree collapses from exponential to linear.
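Python's standard library ships this exact mechanism: `functools.lru_cache` wraps any pure function in a memoising cache, so you don't have to manage the dictionary yourself.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache, keyed by the arguments
def fib(n: int) -> int:
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025 — each fib(k) is computed exactly once
```

The decorator only helps for pure functions: same input, same output, no side effects — exactly the precondition DP relies on.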
How this concept changes your thinking
I need fib(50) and my recursive solution is slow
“The recursive solution feels correct — fib(n) = fib(n-1) + fib(n-2). I just need a faster machine.”
“A faster machine cannot help O(2^n). But naive fib(50) recomputes the same small subproblems billions of times — fib(2) alone is evaluated nearly eight billion times, with the same answer every time. I add a cache: the first call computes and stores, every subsequent call is a dictionary lookup in O(1).”
I need to find the minimum number of coins to make change for £47
“I try every combination of coins until I find the minimum — exponential search”
“I compute the minimum coins for £1, then £2, ..., up to £47. Each answer uses previously computed answers: coins(47) = 1 + min(coins(47−1), coins(47−2), coins(47−5), ...), taking the minimum over every coin denomination.”
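That bottom-up computation, sketched with standard UK denominations in pence (`min_coins` is our illustrative name, not a library function):

```python
def min_coins(amount: int, coins: list[int]) -> float:
    INF = float("inf")
    dp = [0] + [INF] * amount  # dp[a] = fewest coins summing to a pence
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a:
                dp[a] = min(dp[a], dp[a - c] + 1)  # spend coin c last
    return dp[amount]  # stays inf if the amount is unreachable

print(min_coins(47, [1, 2, 5, 10, 20, 50]))  # 4  (20 + 20 + 5 + 2)
```

Note the recurrence was stated before the code — each dp[a] only reads cells that were filled earlier.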
I need the longest common subsequence of two strings
“I try every subsequence of the first string and check if it appears in the second — O(2^n)”
“I build a 2D table: LCS(i,j) = length of LCS of first i characters and first j characters. Each cell uses the cell above, to the left, or diagonally. O(n·m) instead of O(2^n).”
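That 2D table in code — a minimal sketch (`lcs_len` is our name for it):

```python
def lcs_len(a: str, b: str) -> int:
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]; row/column 0 = empty string
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # extend the diagonal match
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])  # drop one char
    return dp[n][m]

print(lcs_len("ABCBDAB", "BDCABA"))  # 4, e.g. "BCBA"
```

Each of the n·m cells is filled in O(1), so the exponential subsequence search collapses to O(n·m).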
Walk through it in plain English
Memoised Fibonacci: fib(5) with a cache
1. Call fib(5). Cache is empty. We need fib(4) + fib(3), so recurse.
2. Call fib(4). Cache is empty. We need fib(3) + fib(2), so recurse.
3. Call fib(3). Cache is empty. We need fib(2) + fib(1), so recurse.
4. Call fib(2). Cache is empty. We need fib(1) + fib(0). Base cases: fib(1) = 1, fib(0) = 0. fib(2) = 1. Store cache[2] = 1. Return 1.
5. Back in fib(3). We have fib(2) = 1 from the cache. fib(1) = 1 from the base case. fib(3) = 2. Store cache[3] = 2. Return 2.
6. Back in fib(4). We need fib(3) again — cache hit! cache[3] = 2. fib(4) = fib(3) + fib(2) = 2 + 1 = 3. Store cache[4] = 3. Return 3.
7. Back in fib(5). cache[4] = 3, cache[3] = 2. fib(5) = 3 + 2 = 5. Without the cache: 15 calls. With the cache: 9 calls. At fib(50): tens of billions of calls versus fewer than a hundred.
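You can verify those counts — 15 naive calls versus 9 memoised — by instrumenting both versions. A quick sketch:

```python
def count_naive(n: int, counter: list) -> int:
    counter[0] += 1  # one increment per function call
    if n <= 1:
        return n
    return count_naive(n - 1, counter) + count_naive(n - 2, counter)

def count_memo(n: int, counter: list, cache: dict) -> int:
    counter[0] += 1  # a cache hit still counts as one (cheap) call
    if n <= 1:
        return n
    if n in cache:
        return cache[n]
    cache[n] = count_memo(n - 1, counter, cache) + count_memo(n - 2, counter, cache)
    return cache[n]

naive_calls, memo_calls = [0], [0]
count_naive(5, naive_calls)
count_memo(5, memo_calls, {})
print(naive_calls[0], memo_calls[0])  # 15 9
```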
Now in code
Fibonacci with memoisation, then with tabulation. Both are O(n) time. Tabulation is often preferred in production because it avoids Python's recursion limit and has no function call overhead.
# Top-down: memoisation (add cache to existing recursion)
def fib_memo(n: int, cache: dict = {}) -> int:
    if n <= 1:
        return n  # base cases: fib(0)=0, fib(1)=1
    if n in cache:
        return cache[n]  # cache hit: return without recursing
    cache[n] = fib_memo(n-1, cache) + fib_memo(n-2, cache)
    return cache[n]  # store before returning

# Bottom-up: tabulation (no recursion)
def fib_tab(n: int) -> int:
    if n <= 1:
        return n
    dp = [0] * (n + 1)  # table indexed 0..n
    dp[1] = 1  # base case
    for i in range(2, n + 1):
        dp[i] = dp[i-1] + dp[i-2]  # build from small to large
    return dp[n]

The cache check must come before the recursive calls — this is the entire mechanism: if we already know the answer, skip all the work below and return immediately. Store in the cache before returning, so every later call finds the result waiting. Bottom-up tabulation builds the table from dp[0] and dp[1] upward; there is no recursion and no risk of stack overflow, which is critical for large n in production. Each dp[i] depends only on dp[i-1] and dp[i-2], so you can optimise space from O(n) to O(1) by keeping only two variables instead of the full table. One caution: the mutable default cache={} is shared across every call to fib_memo — convenient here, but it is hidden global state; in tested code, pass the cache explicitly or use functools.lru_cache.
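The space optimisation is worth writing out: since each value needs only the previous two, the table collapses to two variables.

```python
# O(1) space: keep only the last two Fibonacci values
def fib_o1(n: int) -> int:
    if n <= 1:
        return n
    prev, curr = 0, 1  # fib(0), fib(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr  # slide the two-value window forward
    return curr

print(fib_o1(50))  # 12586269025
```

Same O(n) time as the full table, but constant memory — the standard finishing move for any 1D DP whose recurrence looks back a fixed number of steps.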
The trap — what most people try first
The naive recursive solution looks correct and is easy to write. It works up to n=30 or so. Above n=40 on most machines it becomes unusably slow. The fix is mechanical: add a cache dictionary.
Naive recursion — O(2^n)

# SLOW: exponential number of recursive calls
def fib_naive(n):
    if n <= 1:
        return n
    return fib_naive(n-1) + fib_naive(n-2)
# fib(50): ~40 billion calls — minutes of CPU
# fib(40): ~330 million calls — noticeable seconds
# fib(30): ~2.7 million calls — feels fine in dev
# The test environment hid the real problem

Memoised — O(n)

# FAST: O(n) with memoisation
def fib(n, cache={}):
    if n <= 1:
        return n
    if n in cache:
        return cache[n]  # O(1) lookup
    cache[n] = fib(n-1, cache) + fib(n-2, cache)
    return cache[n]
# fib(50): 51 unique computations, under 1 millisecond
# fib(1000): still fast (use tabulation to avoid stack overflow)

The naive version recomputes fib(48) every time fib(50) and fib(49) both need it — and fib(48) itself recomputes fib(46) twice, and so on. The call count grows exponentially, multiplying by roughly 1.6 with every increment of n, so naive fib(50) makes tens of billions of calls. The cache makes each unique value computed exactly once. Adding a five-line cache converts this from a server-killer to a millisecond operation with zero change to the logic.
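One caution the comment hints at: memoised recursion still nests n stack frames deep, so very large n can hit Python's recursion limit (1000 by default) where tabulation cannot. A quick demonstration:

```python
# Memoised recursion nests n frames deep; tabulation uses a flat loop.
def fib_memo(n: int, cache: dict) -> int:
    if n <= 1:
        return n
    if n in cache:
        return cache[n]
    cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
    return cache[n]

def fib_tab(n: int) -> int:
    if n <= 1:
        return n
    prev, curr = 0, 1
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr

print(len(str(fib_tab(10_000))), "digits")  # tabulation handles n=10000 easily
try:
    fib_memo(10_000, {})  # recursion depth ~10000 exceeds the default limit
except RecursionError:
    print("memoised recursion hit the stack limit")
```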
When to reach for this
How fast is it?
| Scenario | What happens | Speed |
|---|---|---|
| fib(50) naive recursion | Tens of billions of recursive calls — the count grows exponentially with n | Minutes of CPU in practice (O(2^n) upper bound) |
| fib(50) with memoisation | 51 unique computations (fib(0)…fib(50)); every repeat is an O(1) cache lookup | Instant (O(n)) |
| Coin change: minimum coins for amount A with C coin types | Table of size A × C, each cell uses previous cells | Fast (O(A×C)) |
| Longest common subsequence of two strings of length n and m | n×m table, each cell is O(1) | Fast (O(n×m)) |
Exam Answer vs. Production Reality
Dynamic Programming
📖 What the exam expects
Dynamic programming solves problems with overlapping subproblems by storing results. Memoisation adds a cache to recursion (top-down). Tabulation builds a table from the base case up (bottom-up). Both give O(n) or O(n·m) instead of O(2^n).
How this might come up in interviews
DP problems in interviews typically ask for the minimum, maximum, or count of ways to do something. The giveaway phrases: "minimum cost", "maximum profit", "number of ways", "can you achieve exactly K". The underlying pattern is always: a larger problem broken into smaller overlapping subproblems where each subproblem needs to be solved only once.
Common questions:
- Climbing stairs (how many ways to reach step n)
- Coin change (minimum coins for amount A)
- House robber (maximum sum with no two adjacent)
- Longest common subsequence
- Edit distance (minimum operations to transform one string to another)
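As one worked example from that list — house robber, sketched with the usual O(1)-space recurrence (the function name `rob` is the conventional interview name, not a library call):

```python
# Max sum of non-adjacent houses: at each house, either rob it
# (adding to the best total that skipped the previous house) or skip it.
def rob(houses: list[int]) -> int:
    take, skip = 0, 0  # best totals if the previous house was / wasn't robbed
    for v in houses:
        take, skip = skip + v, max(take, skip)
    return max(take, skip)

print(rob([2, 7, 9, 3, 1]))  # 12  (rob 2 + 9 + 1)
```

Notice the dp state is defined before the loop is written — "best total ending here, robbed or not" — which is exactly what a strong answer does out loud.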
Strong answer:
- identifies "overlapping subproblems" and "optimal substructure" by name
- defines dp[i] precisely before writing a line of code
- states the recurrence relation explicitly
- handles base cases first
- can convert from memoisation to tabulation when asked
- mentions space optimisation from O(n) to O(1) for 1D DP

Red flags:
- writes naive recursion and says "it works" without checking complexity
- cannot state the recurrence relation before coding
- confuses the subproblem definition
- implements memoisation but forgets the base case
- uses global mutable state for the cache (breaks in tests)
💡 Analogy
Imagine calculating tips by hand at a restaurant. You work out 15% of £100 for table 3 and write it down. When table 7's bill also comes to £100, you read the answer from your note instead of calculating it again. Dynamic programming is that notepad: the first time you compute something, you write it down; every subsequent time, you look it up. The notepad costs a little memory but saves enormous computation.
⚡ Core Idea
DP transforms recursion with repeated subproblems into computation where each unique subproblem is solved exactly once. Memoisation adds a cache to existing recursion (top-down). Tabulation builds the answer iteratively from the base case (bottom-up). Both achieve the same time complexity; tabulation avoids stack overflow and is usually faster in practice.
🎯 Why It Matters
DP is the technique that makes AI practical: training neural networks uses DP (backpropagation). Text autocomplete uses DP (edit distance to find nearest words). Bioinformatics uses DP (sequence alignment). Navigation uses DP (shortest path in weighted graphs). Understanding DP is understanding how computers solve problems that would otherwise take longer than the age of the universe.