Medium · Working system · ~14h · 5 milestones

Design a scalable system end-to-end with ADRs

You’re asked to design something with real scale and real failure modes (a feed, chat, or notification system) and to document the decisions the way a senior engineer would.

Distributed systems · Sharding/partitioning · Consistency models · Caching strategy · ADRs · Failure analysis · Observability & SLOs · Cost modeling · Runbooks

What you'll build

A full design for a scalable system covering data partitioning, consistency, caching, and failure modes, documented as Architecture Decision Records, with a prototype of the riskiest part.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

DESIGN.md (markdown)
# Scalable System: Design

## Problem statement
(one paragraph: the system + the central tension it must resolve)

## Scale assumptions
- Users / DAU:
- Hardest mechanism (fan-out / ordering / hotspot):
- Load on that mechanism (worst case, not average):

## Partitioning
- Partition key:
- Hotspot argument (why this key spreads evenly):

## Consistency (per data path)
| path        | model            | why            |
|-------------|------------------|----------------|
| <write X>   | strong/eventual  |                |

## Caching
- What / eviction / sized against read estimate:
- Invalidation:

## Failure modes
| component | failure | mitigation |
|-----------|---------|------------|

## SLOs (see milestone 3)
## Decisions: see adr/

Reading this file

  • ## Problem statement: Write the central tension first. A design without a hard sub-problem is decision-free and boring.
  • Hardest mechanism (fan-out / ordering / hotspot): Naming the single thing that makes the system hard focuses the whole design on what actually matters.
  • Hotspot argument (why this key spreads evenly): A partition key only helps if it spreads load, so you must argue that it will not create a hot shard.
  • ## Consistency (per data path): Different paths need different guarantees. Deciding strong vs eventual per path avoids hidden bugs.
  • ## Failure modes: Distributed systems are defined by how they fail, so every component needs a named failure and a mitigation.

Top-level doc. Write the problem statement and scale assumptions before any boxes.
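The hotspot argument in the template can be backed by a quick measurement rather than intuition. Below is a minimal sketch, assuming a hash-based mapping from partition key to shard. The function names (`shard_for`, `skew`), the shard count, and the workload numbers are all hypothetical and chosen only for illustration; they are not part of the project's starter files.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a partition key to a shard using a stable hash.

    Python's built-in hash() is salted per process, so a digest is used instead.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def skew(load_by_key: dict[str, int], num_shards: int) -> float:
    """Hottest-shard load divided by mean shard load (1.0 means perfectly even)."""
    per_shard: Counter[int] = Counter()
    for key, load in load_by_key.items():
        per_shard[shard_for(key, num_shards)] += load
    mean = sum(load_by_key.values()) / num_shards
    return max(per_shard.values()) / mean


# Hypothetical workload: 10,000 ordinary users, then the same plus one very hot key.
uniform = {f"user-{i}": 10 for i in range(10_000)}
with_celebrity = {**uniform, "celebrity": 50_000}

print(f"uniform skew:        {skew(uniform, 16):.2f}")
print(f"with celebrity skew: {skew(with_celebrity, 16):.2f}")
```

A skew close to 1.0 supports the claim that the key spreads evenly. A large value for the second workload shows that hashing alone does not help when a single key carries most of the load, which is the kind of case the hotspot argument has to address explicitly.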

That's 1 of 9 explained code blocks in this single project.

The build, milestone by milestone

  1. Pick the hard problem

    5 guided steps

    A system without a genuinely hard sub-problem produces a boring, decision-free design. Picking the right problem is what makes the rest worth doing.

  2. Design for scale & failure

    5 guided steps

    Distributed systems are defined by their failure modes. A design that only describes the happy path is incomplete. The senior signal is naming what breaks and how you contain it.

  3. Make it observable & cost it

    5 guided steps

    A scalable design that can’t be operated cheaply or debugged at 3am is half a design. SLOs, a cost number, and a runbook are the difference between a diagram and a system someone can actually run.

  4. Record decisions

    5 guided steps

    Decisions without recorded rationale get re-litigated forever and can’t be safely revisited. ADRs are how senior engineers make their reasoning durable and reviewable.

  5. Prototype the risk

    5 guided steps

    A design is a hypothesis. Prototyping the one assumption you’re least sure of is how you find out you’re wrong on paper instead of in production.
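For the SLO work in milestone 3, it helps to turn a target into a concrete error budget. The sketch below assumes an availability-style SLO measured as good requests over total requests in a 30-day window. The function names and the example numbers are hypothetical, not taken from the project's reference solutions.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of full unavailability that an availability SLO allows in the window."""
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    return (1.0 - slo) * window_days * 24 * 60


def budget_remaining(slo: float, good: int, total: int) -> float:
    """Fraction of the error budget still unspent (negative means the SLO is blown)."""
    if total == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_bad = (1.0 - slo) * total
    actual_bad = total - good
    return 1.0 - actual_bad / allowed_bad


# Hypothetical numbers: a 99.9% target and 500 failed requests out of one million.
print(f"budget: {error_budget_minutes(0.999):.1f} min per 30 days")
print(f"remaining: {budget_remaining(0.999, good=999_500, total=1_000_000):.0%}")
```

Stating the budget in minutes and in failed requests gives the runbook a threshold to alert on, instead of a percentage that is hard to act on at 3am.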

What's inside when you start

4 starter files, ready to clone
5 guided milestones
5 full reference solutions
9 code blocks explained line-by-line
5 "is it working?" checks
4 interview questions it prepares you for

You'll walk away with

A design doc with scaling, consistency, caching, and failure analysis
An SLO + observability plan (signals, sources, dashboard sketch) and a $/month cost model
A one-page runbook for the most-likely incident (symptom → confirm → mitigate → rollback)
A set of ADRs documenting each major decision and its consequences
A focused prototype of the riskiest assumption, with a measurement and verdict
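The $/month cost model can start as a few line items multiplied out, as in the sketch below. Every component name, unit count, and price here is a placeholder assumption for illustration, not a real quote from any provider, and `CostLine` is a hypothetical helper rather than something shipped with the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostLine:
    name: str
    units: float           # e.g. instances, GB stored, millions of requests
    unit_price_usd: float  # placeholder price per unit per month


def monthly_cost(lines: list[CostLine]) -> float:
    """Total monthly cost in USD across all line items."""
    return sum(line.units * line.unit_price_usd for line in lines)


# Hypothetical line items; replace with your own estimates and real pricing.
lines = [
    CostLine("app servers", units=6, unit_price_usd=70.0),
    CostLine("storage (GB)", units=2000, unit_price_usd=0.10),
    CostLine("cache nodes", units=3, unit_price_usd=120.0),
]
daily_active_users = 100_000  # assumption, taken from your scale section

total = monthly_cost(lines)
print(f"total: ${total:,.2f}/month")
print(f"per DAU: ${total / daily_active_users:.4f}/month")
```

Dividing by daily active users ties the cost number back to the scale assumptions, so a change in either one shows up in the other.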

This is portfolio-grade. Build it free.

Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.
