Caching Strategies: Make Backends Fast Without Melting the Database

Your database is doing the same work a thousand times a second

Picture a product page. Every visitor hits the same endpoint, which runs the same query: "give me product 42 and its reviews." The product hasn't changed in three weeks. Yet your database recomputes that answer thousands of times a second, joining tables, scanning indexes, serializing rows, only to hand back the exact same bytes it handed back a millisecond ago.

That is wasted work, and it is one of the most common reasons backends fall over under load. The fix is not a bigger database. The fix is to remember the answer so you don't have to ask again. That's caching, and getting it right is one of the highest-leverage skills a backend engineer can build.

Who this is for

Junior-to-mid backend engineers who keep hearing "just add a cache" and want to actually understand the patterns, the trade-offs, and the foot-guns. You should be comfortable with HTTP requests and a database query. No distributed-systems PhD required.

The one principle behind every cache

A cache is a small, fast copy of data kept close to where it's needed, traded against the risk that the copy goes stale.
The entire field of caching, in one sentence

Everything else (the patterns, the TTLs, the invalidation headaches) flows from that single trade. You gain speed; you risk serving something slightly out of date. The whole craft is deciding how much staleness you can tolerate for how much speed.

The basement archive with every file the company owns: your database. Complete, durable, slow to walk to.

The handful of folders you keep on your desk: the cache. Tiny, instant to grab, holds only what's hot.

Walking to the basement when a file isn't on your desk: a cache miss. Fall back to the database, then bring a copy up.

A desk folder going out of date after someone edits the master: a stale cache. The hard part is knowing when to throw the copy away.

A cache is just keeping the files you use most on your desk.

What a cached request actually looks like

The classic flow is the cache-aside read. Before touching the database, ask the cache. If it has the answer (a hit), serve it. If it doesn't (a miss), fetch from the database, drop a copy in the cache for next time, then serve. Here's that path:

A read with cache-aside: hits skip the database entirely; misses populate the cache on the way back.

1. Look in the cache first. Build a stable key like product:42 and ask the cache for it. With a networked cache such as Redis, this is a single round-trip, typically under a millisecond on a local network.
2. On a hit, serve and stop. If the value is there, deserialize it and return. The database never hears about this request. This is the whole point: the hot path costs almost nothing.
3. On a miss, fall back to the database. The key isn't cached (first request, or it expired). Run the real query against the database to get the authoritative answer.
4. Populate the cache with a TTL. Write the fresh value back into the cache with an expiry (say, 300 seconds) so the next reader gets a hit. The TTL is your safety net against staleness.
5. Return to the client. Serve the answer. The first reader paid the full cost; everyone for the next five minutes rides for free.
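
To put rough numbers on that last step, here is a back-of-the-envelope sketch. The request rate is an assumption borrowed from the opening example, and the calculation is idealized: one hot key, and exactly one miss per TTL window (no stampede).

```python
# Back-of-the-envelope load estimate for one hot key (assumed numbers).
REQUESTS_PER_SECOND = 1_000   # assumption: traffic for a single hot key
TTL_SECONDS = 300             # the expiry used in step 4

requests_per_window = REQUESTS_PER_SECOND * TTL_SECONDS
db_queries_per_window = 1     # idealized: only the first reader misses

hit_ratio = 1 - db_queries_per_window / requests_per_window

print(f"requests per TTL window: {requests_per_window}")
print(f"database queries per window: {db_queries_per_window}")
print(f"hit ratio: {hit_ratio:.6f}")
```

Under these assumptions the database answers one query where it would otherwise have answered 300,000. Real hit ratios are lower, because keys are evicted, traffic is spread over many keys, and concurrent misses happen.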

The four patterns, and what each one costs you

Reads and writes can each route through the cache in different ways. Four patterns dominate. The split that matters most: does your application code manage the cache (cache-aside), or does the cache layer itself sit in front of the database and manage it for you (read-through / write-through / write-behind)?

Pattern	How it works	Trade-off
Cache-aside	App checks cache; on miss, app queries DB and populates the cache itself. App owns the logic.	Simple and flexible, but every reader duplicates the miss logic, and a write must remember to invalidate.
Read-through	App only talks to the cache; the cache loads from the DB on a miss transparently.	Cleaner app code, but you need a cache that supports a loader, and first-read latency still hits the DB.
Write-through	Writes go to the cache and the DB synchronously, in one operation.	Cache is always fresh, but every write pays both write latencies: slower writes in exchange for consistency.
Write-behind	Writes hit the cache instantly; the cache flushes to the DB asynchronously later.	Blazing-fast writes, but a crash before flush loses data, and the DB is briefly behind reality.

The four core caching patterns. Choose by how much consistency you need and how much write latency you can spend.

For most read-heavy services, cache-aside is the default: it's explicit, easy to reason about, and a cache failure just means slower requests, not broken ones. Reach for write-through when freshness is non-negotiable, and write-behind only when write throughput is the bottleneck and you can tolerate a small window of risk.
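
The three cache-managed patterns can be sketched in a few lines each. This is a minimal illustration, not a production design: the class names are hypothetical, a plain dict stands in for both the cache storage and the database, and there is no expiry, eviction, or error handling.

```python
from collections import deque
from typing import Callable


class ReadThroughCache:
    """The cache owns the loader; callers never touch the database directly."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        self._data: dict[str, object] = {}

    def get(self, key: str) -> object:
        if key not in self._data:           # miss: the cache loads for you
            self._data[key] = self._loader(key)
        return self._data[key]


class WriteThroughCache:
    """Each write updates the database and the cache before returning."""

    def __init__(self, db: dict) -> None:
        self._db = db
        self._data: dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        self._db[key] = value               # synchronous database write
        self._data[key] = value             # cache updated in the same call

    def get(self, key: str) -> object:
        return self._data[key]


class WriteBehindCache:
    """Writes land in the cache at once and reach the database on flush()."""

    def __init__(self, db: dict) -> None:
        self._db = db
        self._data: dict[str, object] = {}
        self._pending: deque = deque()

    def put(self, key: str, value: object) -> None:
        self._data[key] = value
        self._pending.append((key, value))  # lost if the process dies here

    def flush(self) -> None:
        while self._pending:
            key, value = self._pending.popleft()
            self._db[key] = value


database: dict = {}
write_behind = WriteBehindCache(database)
write_behind.put("product:42", {"price": 10})
print(database)   # still empty: the write has not been flushed yet
write_behind.flush()
print(database)   # the database has caught up
```

The gap between put() and flush() is the "small window of risk" in the table: anything still in the pending queue disappears if the process crashes.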

Cache-aside in code

Here's the canonical cache-aside read in Python with Redis. Note the three moving parts: a stable key, a TTL on the write, and serialization (Redis stores bytes, not objects).

products.py

```python
import json
import redis

# Assumption: `db` is your application's data-access layer (not shown here).

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
TTL_SECONDS = 300  # 5 minutes; tune to how much staleness you can tolerate


def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"

    # 1. Ask the cache first
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # HIT: DB never touched

    # 2. MISS: fall back to the source of truth
    product = db.query_product(product_id)

    # 3. Populate the cache with a TTL so the next reader hits
    cache.set(key, json.dumps(product), ex=TTL_SECONDS)

    return product


def update_product(product_id: int, changes: dict) -> None:
    db.update_product(product_id, changes)
    # Invalidate: don't trust the old copy. The next read re-populates.
    cache.delete(f"product:{product_id}")
```

The write path is where most bugs live. After updating the database, we delete the cached key rather than trying to overwrite it. Deleting is safer: the next read re-populates from the authoritative source, so you never risk writing a half-built or out-of-order value into the cache.

The two hard problems: invalidation and stampede

There's an old joke that there are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors. The joke is funny because invalidation really is brutal. A cache is a promise that the copy still matches the source, and the moment that promise breaks silently, you're serving wrong data to real users.

Invalidation: knowing when to throw the copy away

You have two tools. TTL (time-to-live) is the lazy, reliable one: every entry self-destructs after N seconds, so staleness is bounded even if you forget everything else. Explicit invalidation is the precise one: when data changes, delete the matching key right then. Real systems use both: a short-ish TTL as a backstop, plus explicit deletes on writes for freshness where it matters.
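
Both tools fit in one small class. The sketch below is an in-memory stand-in, not how Redis implements expiry; the TTLCache name and the injectable clock parameter are assumptions made so the behavior is easy to test.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """In-memory cache with per-entry expiry and explicit invalidation."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, object]] = {}

    def set(self, key: str, value: object) -> None:
        # Store the value together with the moment it stops being valid.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:     # TTL backstop: entry has expired
            del self._entries[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)        # explicit delete on write
```

A write path would call invalidate() right after updating the database, and the TTL covers any write path that forgets to. Note that get() returns None for both a miss and an expired entry, so this sketch cannot cache None itself.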

Stampede: the thundering herd

A cache stampede (or thundering herd) happens when a hot key expires and a thousand concurrent requests all miss at the same instant. They all fall through to the database simultaneously, each running the expensive query the cache was supposed to prevent, and the database, hit with a thousandfold spike, melts. The fix: a short lock so only one request recomputes while the others wait, or staggered TTLs (add a little random jitter) so keys don't all expire on the same tick.
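
Both defenses can be sketched briefly. The code below is a single-process illustration under stated assumptions: a dict stands in for the shared cache, the 10% jitter is an arbitrary choice, and the lock only coordinates threads inside one process. Across several app servers you would need a lock held in the shared cache instead.

```python
import random
import threading
from typing import Callable

BASE_TTL_SECONDS = 300
JITTER_FRACTION = 0.1  # assumption: spread expiries by up to +/-10%


def jittered_ttl(base: int = BASE_TTL_SECONDS) -> int:
    """Return the base TTL shifted by a random amount so keys expire apart."""
    spread = int(base * JITTER_FRACTION)
    return base + random.randint(-spread, spread)


_cache: dict[str, object] = {}           # stand-in for the shared cache
_locks: dict[str, threading.Lock] = {}
_locks_guard = threading.Lock()


def _lock_for(key: str) -> threading.Lock:
    with _locks_guard:                    # create each key's lock exactly once
        return _locks.setdefault(key, threading.Lock())


def get_or_compute(key: str, compute: Callable[[], object]) -> object:
    value = _cache.get(key)
    if value is not None:
        return value                      # hit: no lock needed
    with _lock_for(key):                  # one thread recomputes per key
        value = _cache.get(key)           # re-check: someone may have filled it
        if value is None:
            value = compute()
            _cache[key] = value
        return value
```

The re-check inside the lock is what keeps the waiting requests off the database: by the time they acquire the lock, the first request has already stored the value. With the Redis example from earlier, the jitter would be applied at write time, for instance cache.set(key, payload, ex=jittered_ttl()).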

A cache is not a database

Caches are allowed to lose data; that's the deal you signed for speed. Never make correctness depend on a value being in the cache. If Redis restarts and forgets everything, your service should get slower, not wrong. Always design the miss path to produce the correct answer on its own.
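
One way to honor that rule is to treat any cache failure as a miss. In this sketch the CacheUnavailable exception and the function-valued parameters are hypothetical stand-ins; with a real client you would catch that client's own connection error class instead.

```python
import json
from typing import Callable, Optional


class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache cannot be reached."""


def get_with_fallback(
    key: str,
    cache_get: Callable[[str], Optional[str]],
    cache_set: Callable[[str, str], None],
    load_from_db: Callable[[], dict],
) -> dict:
    try:
        cached = cache_get(key)
        if cached is not None:
            return json.loads(cached)
    except CacheUnavailable:
        pass  # cache is down: treat it as a miss and keep going

    value = load_from_db()                # the miss path is always correct

    try:
        cache_set(key, json.dumps(value))
    except CacheUnavailable:
        pass  # failing to populate only costs speed, never correctness
    return value
```

With the cache down, every request pays the database cost, so the service is slower but still returns the right answer.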

Where caches live

"The cache" isn't one place. It's a spectrum from closest-and-smallest to farthest-and-biggest, and serious systems layer several:

In-process: a hash map or LRU inside your app's own memory. Nanosecond-fast, zero network, but each server has its own copy (no sharing) and it dies on restart. Great for tiny, hot, rarely-changing data like feature flags.
Distributed (Redis / Memcached): a separate service all your app servers share. One network hop, survives app restarts, and one invalidation reaches everyone. This is the workhorse for cache-aside. Redis adds data structures and persistence; Memcached is leaner and pure key-value.
CDN / edge: caches whole HTTP responses (images, pages, API results) in data centers near the user. On a hit, the request never reaches your servers at all. Ideal for public, cacheable content; controlled with Cache-Control headers.

These compose. A request might check an in-process cache, then Redis, then the database, each layer catching what the one before it missed. The closer the layer, the faster and the smaller. Start with one shared Redis; reach for the others when the numbers tell you to.
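
A layered lookup can be sketched as a chain of fallbacks. The function and parameter names here are hypothetical, and the in-process layer is a bare dict with no size limit or expiry, which a real one would need.

```python
from typing import Callable, Optional


def layered_get(
    key: str,
    local: dict,
    shared_get: Callable[[str], Optional[object]],
    shared_set: Callable[[str, object], None],
    load_from_db: Callable[[str], object],
) -> object:
    if key in local:                      # layer 1: in-process, no network
        return local[key]

    value = shared_get(key)               # layer 2: shared cache, one hop
    if value is None:
        value = load_from_db(key)         # layer 3: the source of truth
        shared_set(key, value)

    local[key] = value                    # warm the closer layer on the way back
    return value
```

Each extra layer is another copy that can go stale, so an invalidation has to reach every layer that holds the key, or each layer needs its own short TTL.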

Common mistakes that cost hours

No TTL. A cache entry with no expiry can outlive its truth forever. If your invalidation has a single bug, you serve that stale value until the heat death of the universe. Always set a TTL as a backstop, even when you also invalidate explicitly.
Caching everything. Caching data that's read once, or changes every second, just adds a network hop and a consistency risk for zero benefit. Cache what's hot and stable, not what's convenient.
Serving stale data silently. Updating the database but forgetting to invalidate the cache is the classic correctness bug. Every write path must either delete or refresh the matching key. Make it part of the write, not an afterthought.
Ignoring the thundering herd. A naive cache works fine in dev and stampedes the database the first time a popular key expires under real traffic. Add a lock or TTL jitter to hot keys before they bite you in production.
Trusting the cache for correctness. Treating a cache as durable storage means a Redis restart becomes a data-loss incident. The miss path must always be able to produce the right answer alone.

Takeaways

Caching in nine lines

A cache trades a small risk of staleness for a large win in speed.
Cache-aside is the sensible default: check cache, miss falls back to DB, then populate.
Read-through hides the loader; write-through keeps the cache fresh; write-behind makes writes fast but riskier.
Always set a TTL; it's your backstop when invalidation has a bug.
On writes, delete the key rather than overwriting it; the next read re-populates cleanly.
Cache stampedes melt databases; defend hot keys with a lock or TTL jitter.
Caches live in-process, in Redis/Memcached, and at the CDN edge; layer them deliberately.
Cache what's hot and stable, not everything.
Never make correctness depend on the cache; the miss path must stand on its own.

Where to go next

Caching makes reads cheap, but it can't fix a slow query underneath: the first reader and every cache miss still pay that cost. Make the source fast too, and understand how caching fits into scaling the whole system.

Database Indexing & Query Performance: make the miss path itself fast, so a cold cache doesn't hurt.
Scalability Principles: where caching sits among the broader tools for handling load.
Follow the full Backend Engineer track to build these skills in order.
