How senior engineers approach architecture from scratch: requirements → scale estimation → components → bottlenecks → trade-offs.
A junior engineer asked "Design a URL shortener" immediately starts drawing boxes: "We need a database, an API server, and a cache."
A senior engineer asks questions first: "How many URLs per day? Read-heavy or write-heavy? Do short links expire? Analytics required? Global or regional?" Only after understanding the constraints do they propose components.
System design thinking is not a list of components. It is a process of reasoning under uncertainty. The output is not the "right answer" — it is a defensible design whose trade-offs you can explain.
The RECE framework
Requirements → Estimation → Components → Evaluation. A structured approach to any system design problem: clarify requirements, estimate scale, design components to meet that scale, then evaluate bottlenecks and trade-offs.
Most design failures come from building the wrong thing, not from building the thing wrong. Requirements clarification is not a formality — it is where you discover constraints that change the entire design.
The four categories of requirements to clarify
Scale estimation tells you what class of problem you are solving. It determines whether you need a single database or a globally distributed cluster.
The numbers every engineer should know
Latency (approximate):
- L1 cache: 0.5ns | L2 cache: 7ns | RAM: 100ns
- SSD random read: 100µs | HDD seek: 10ms
- Network: same DC = 0.5ms | cross-region = 30-150ms
Bandwidth:
- SSD throughput: 500MB/s | HDD: 100MB/s
- 1Gbps network: 125MB/s | 10Gbps: 1.25GB/s
Storage:
- 1M users × 1KB profile = 1GB
- 1B images × 300KB average = 300TB
Time:
- 86,400 seconds/day | ~2.6M seconds/month | ~31.5M seconds/year
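As a rough illustration of how these numbers get used, the sketch below turns the bandwidth figures into "seconds to move 1GB". The function name `transfer_seconds` and the dictionary are made up for this example, and the calculation assumes sustained throughput with no latency or protocol overhead.

```python
# Throughput figures from the list above, in MB/s (decimal units).
THROUGHPUT_MB_PER_S = {
    "HDD": 100,
    "1Gbps network": 125,
    "SSD": 500,
    "10Gbps network": 1250,
}


def transfer_seconds(size_mb: float, throughput_mb_per_s: float) -> float:
    """Time to move size_mb at a sustained rate, ignoring latency and overhead."""
    return size_mb / throughput_mb_per_s


ONE_GB_IN_MB = 1000

for medium, rate in THROUGHPUT_MB_PER_S.items():
    seconds = transfer_seconds(ONE_GB_IN_MB, rate)
    print(f"{medium}: {seconds:.1f}s per GB")
```

Even this crude estimate shows that a 1Gbps link is slower than a local SSD, which is one reason data locality matters in component placement.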
Back-of-envelope estimation for a social media app (100M DAU)
1. Reads: 100M DAU × 20 timeline refreshes/day = 2B read requests/day = ~23,000 reads/second
2. Writes: 100M DAU × 2 posts/day = 200M writes/day = ~2,300 writes/second
3. Write:read ratio = 1:10 → read-heavy, cache is critical
4. Storage: 200M posts/day × 280 chars × 2 bytes = ~112GB/day → 40TB/year of text alone
5. Bandwidth: 23,000 reads/second × 10KB average timeline payload = 230MB/s read bandwidth needed
6. Conclusion: multiple database replicas needed, read cache mandatory, CDN for media, write-through or write-behind cache for hot timelines
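The arithmetic in the estimation above can be reproduced with a short script. This is a minimal sketch: the function name `estimate` and the returned keys are invented for illustration, and the inputs are the lesson's own assumptions (100M DAU, 20 refreshes and 2 posts per user per day, 280 characters at 2 bytes each, 10KB per timeline payload).

```python
SECONDS_PER_DAY = 86_400


def estimate(dau: int, refreshes_per_user: int, posts_per_user: int,
             chars_per_post: int, bytes_per_char: int,
             timeline_payload_kb: int) -> dict:
    """Back-of-envelope load figures from daily usage assumptions."""
    reads_per_day = dau * refreshes_per_user
    writes_per_day = dau * posts_per_user
    reads_per_sec = reads_per_day / SECONDS_PER_DAY
    writes_per_sec = writes_per_day / SECONDS_PER_DAY
    text_gb_per_day = writes_per_day * chars_per_post * bytes_per_char / 1e9
    return {
        "reads_per_sec": reads_per_sec,
        "writes_per_sec": writes_per_sec,
        "reads_per_write": reads_per_sec / writes_per_sec,
        "text_gb_per_day": text_gb_per_day,
        "text_tb_per_year": text_gb_per_day * 365 / 1000,
        # KB/s divided by 1000 gives MB/s (decimal units).
        "read_mb_per_sec": reads_per_sec * timeline_payload_kb / 1000,
    }


result = estimate(dau=100_000_000, refreshes_per_user=20, posts_per_user=2,
                  chars_per_post=280, bytes_per_char=2, timeline_payload_kb=10)

for name, value in result.items():
    print(f"{name}: {value:,.1f}")
```

The yearly text storage comes out slightly under 41TB, which the estimate above rounds to 40TB. At this level of precision the difference does not change the conclusion.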
After requirements and estimation, design the components to meet those constraints. Start simple, then layer on complexity where the numbers demand it.
Standard architectural components and when you need each
A system has 5,000 reads/second and 500 writes/second. Which component should you prioritize adding first?
Every design has bottlenecks and trade-offs. The ability to identify and articulate them is what separates a senior engineer from a junior engineer in a design review.
| Component | Common bottleneck | Solution | Trade-off created |
|---|---|---|---|
| Database | Write throughput exceeds single-node capacity | Sharding by user ID | Cross-shard queries become expensive; joins across shards must be done in the application or avoided through denormalization |
| Cache | Cache invalidation bugs (stale data) | TTL + explicit invalidation on writes | Adds complexity; eventual consistency window (brief stale reads) |
| Fan-out (Twitter model) | Celebrity with 50M followers: 50M writes on each tweet | Pull model for celebrities (compute timeline on read) | Higher read latency for users who follow celebrities |
| Consistency vs availability | Network partition splits primary and replica | Choose: reject writes (CP) or allow divergence (AP) | CAP theorem: during a partition you must give up either consistency or availability; partition tolerance is not optional in a distributed system |
| Session management | Sticky sessions prevent autoscaling | Externalize sessions to Redis | Adds Redis as a dependency; single point of failure if Redis is not HA |
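The cache row in the table ("TTL + explicit invalidation on writes") could look roughly like the sketch below. It uses an in-memory dictionary as a stand-in for both the cache and the database. The names `TTLCache` and `ProfileStore` are hypothetical, and a real deployment would more likely use Redis or Memcached.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory cache whose entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def put(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


class ProfileStore:
    """Cache-aside reads; writes go to the database and invalidate the cache."""

    def __init__(self, cache: TTLCache) -> None:
        self._db: Dict[str, str] = {}  # stand-in for a real database
        self._cache = cache
        self.db_reads = 0

    def read(self, user_id: str) -> Optional[str]:
        cached = self._cache.get(user_id)
        if cached is not None:
            return cached
        self.db_reads += 1
        value = self._db.get(user_id)
        if value is not None:
            self._cache.put(user_id, value)
        return value

    def write(self, user_id: str, value: str) -> None:
        self._db[user_id] = value
        self._cache.invalidate(user_id)  # explicit invalidation on write
```

The TTL bounds how long a stale entry can survive if an invalidation is ever missed, which is the "eventual consistency window" named in the trade-off column.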
The bottleneck-first mindset
After designing a system, ask: "Where is the first thing that breaks if traffic doubles?" Then double it again: "What breaks next?" This iterative bottleneck identification is how real systems are designed — not by predicting all future problems, but by solving the current limiting factor.
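One way to make the "what breaks first?" question concrete is to compare each component's capacity with its current load. The sketch below is illustrative only: the function `saturation_order`, the component names, and all of the load and capacity figures are hypothetical.

```python
from typing import Dict, List, Tuple


def saturation_order(load: Dict[str, float],
                     capacity: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (component, traffic multiplier at which it saturates), soonest first."""
    headroom = {name: capacity[name] / load[name] for name in load}
    return sorted(headroom.items(), key=lambda item: item[1])


# Hypothetical requests/second handled today and the per-component limits.
current_load = {"api_servers": 5_000, "cache": 20_000, "db_writes": 2_300}
capacity = {"api_servers": 20_000, "cache": 100_000, "db_writes": 4_000}

for name, multiplier in saturation_order(current_load, capacity):
    print(f"{name}: saturates at {multiplier:.1f}x current traffic")
```

With these made-up numbers, database writes saturate before traffic doubles, so that is the limiting factor to address first. After raising that limit, rerunning the same comparison shows what breaks next.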
System design interviews are used at every level above junior. FAANG and other major tech companies use them to assess architectural reasoning, where the process matters as much as the solution.
Common questions:
Key takeaways
What is the first question to ask when approaching any system design problem?
Clarify requirements — both functional (what it does) and non-functional (availability, latency, consistency). The constraints define the design; designing without them is guessing.
A system has a 1:100 write:read ratio. What is the first architectural component to consider?
A caching layer (Redis/Memcached) to absorb the overwhelming read traffic. With 1:100 write:read, most reads can be served from cache, dramatically reducing database load. The cost is cache invalidation logic and a brief window of stale reads.
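The effect of a cache on database load can be estimated with one line of arithmetic. The sketch below assumes every write reaches the database and only cache misses generate database reads. The function name `db_requests_per_sec`, the traffic figures, and the hit ratios are hypothetical.

```python
def db_requests_per_sec(reads_per_sec: float, writes_per_sec: float,
                        hit_ratio: float) -> float:
    """Database load when only cache misses and writes reach the database."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return writes_per_sec + reads_per_sec * (1.0 - hit_ratio)


# A 1:100 write:read workload: 100 writes/s and 10,000 reads/s.
for hit_ratio in (0.0, 0.9, 0.99):
    load = db_requests_per_sec(10_000, 100, hit_ratio)
    print(f"hit ratio {hit_ratio:.0%}: ~{load:,.0f} database requests/s")
```

Under these assumptions, moving from no cache to a 99% hit ratio cuts database traffic from roughly 10,100 to roughly 200 requests per second.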