The speed of your feedback loop determines the speed of your learning — and your systems. Every engineering practice that works is a feedback loop optimisation.
Know that a failing CI is a broken feedback loop for the entire team. Keep CI green. Run linters and type checks locally before pushing. A flaky test is not "probably fine" — it degrades the signal for everyone.
Actively measure your team's feedback loop latencies: how long does CI take, how long from merge to production, how quickly do you detect incidents? If CI takes 45 minutes, it is your job to compress it. Know your team's DORA metrics.
Design systems where monitoring is architecturally independent from what it monitors. Architect for observability as a first-class property. Champion DORA metric tracking and use the data to justify engineering investment. Set explicit MTTD and lead time targets.
Set org-level feedback loop standards: CI must be under X minutes, MTTD under Y minutes, deployment frequency at least Z. Use DORA as a lagging indicator of feedback loop health across teams. Identify where monitoring is architecturally coupled to the systems it monitors and drive independence.
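These latencies are easy to measure once pipeline events carry timestamps. A minimal sketch, where the event names and timestamps are illustrative (in practice they would come from your CI provider's and deploy tooling's APIs):

```python
from datetime import datetime, timedelta

# Hypothetical pipeline events for one change; names are illustrative.
events = {
    "commit":        datetime(2024, 5, 1, 9, 0),
    "ci_finished":   datetime(2024, 5, 1, 9, 45),
    "merged":        datetime(2024, 5, 1, 10, 30),
    "in_production": datetime(2024, 5, 1, 16, 0),
}

def loop_latencies(ev: dict) -> dict:
    """Return each feedback-loop latency as a timedelta."""
    return {
        "ci":            ev["ci_finished"] - ev["commit"],
        "merge_to_prod": ev["in_production"] - ev["merged"],
        # DORA "Lead Time for Changes": commit → running in production
        "lead_time":     ev["in_production"] - ev["commit"],
    }

for name, latency in loop_latencies(events).items():
    print(f"{name}: {latency}")
```

Tracking these numbers over time, rather than guessing, is what turns "CI feels slow" into "CI is the 45-minute leg of a 7-hour lead time."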
BGP config update withdraws all Facebook routes — internal and external — from the internet
CRITICAL: Facebook.com, Instagram, WhatsApp offline for 3B users. Monitoring, Workplace, and badge-access systems simultaneously unreachable
CRITICAL: Engineers identify the issue remotely but cannot access systems. Physical data-centre access required — door badges also offline
WARNING: Services restore after 6 hours as engineers manually apply BGP fix on-site
The question this raises
If your monitoring, alerting, and incident-response tools all live inside the systems they are supposed to monitor, what happens to your ability to see and fix problems when those systems fail?
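One way out of that trap is a probe that runs entirely outside the monitored system and depends on nothing inside it. A minimal sketch in Python, assuming only that the target exposes a public HTTP endpoint; the URL and alert path are hypothetical:

```python
# Independent external probe: runs OUTSIDE the monitored system's
# network, so it keeps working when that network goes down.
import urllib.request
import urllib.error

def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers with HTTP 2xx/3xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, 4xx/5xx: all
        # count as "unreachable from this external vantage point".
        return False

def check_and_alert(url: str) -> None:
    if not probe(url):
        # The alerting path must be independent too — e.g. a
        # third-party pager service reached over the public internet.
        print(f"ALERT: {url} unreachable from external vantage point")
```

The probe itself is trivial; the design constraint is where it runs and what its alert path depends on.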
Lesson outline
A feedback loop is any cycle where the output of a system is used as an input to adjust the system's next action. Engineering is entirely made of them: your compiler tells you if the code is valid. Your tests tell you if the logic is correct. Your monitoring tells you if the system is healthy. Your users tell you if the feature is useful.
The engineering principle: the speed of your feedback loop determines the speed of your learning. A feedback loop that takes 45 minutes to tell you your code is broken means 45 minutes of compounding wrong assumptions. A feedback loop that takes 78 days to tell you a CVE is in your dependencies (Equifax) means 78 days of exposure.
The missing leg: monitoring your monitoring
Facebook's October 2021 BGP outage lasted 6 hours not because the bug was hard to fix, but because the engineers' monitoring, communication tools, and physical door-badge access all ran on the same internal network that went down. The feedback loop that would have enabled the fix was part of what broke. Your system's health is only as visible as the architectural independence of your monitoring from the system being monitored.
Every software delivery pipeline contains four distinct feedback loops, each with a different latency and cost of a missed signal. Engineers who understand all four can compress them independently.
The DORA research program identified four metrics that predict software delivery performance. Every one of them is a feedback loop latency measurement, not an arbitrary KPI.
| DORA Metric | What feedback loop it measures | Elite benchmark |
|---|---|---|
| Deployment Frequency | How often you close the real-user feedback loop — how often users get your changes | Multiple times per day |
| Lead Time for Changes | Total pipeline latency: code committed → running in production | < 1 hour |
| Change Failure Rate | Signal quality of your upstream loops — if high, earlier loops aren't catching enough | < 5% |
| MTTR | Speed of your incident response feedback loop: detection → resolution | < 1 hour |
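All four metrics can be derived directly from deployment records. A hedged sketch, assuming a hypothetical record shape (this is not a standard schema — substitute whatever your deploy tooling actually emits):

```python
from datetime import datetime, timedelta
from statistics import median

# Illustrative deployment records over a 2-day window.
deploys = [
    {"committed": datetime(2024, 5, 1, 9),  "deployed": datetime(2024, 5, 1, 10),
     "failed": False},
    {"committed": datetime(2024, 5, 1, 11), "deployed": datetime(2024, 5, 1, 12),
     "failed": True, "restored": datetime(2024, 5, 1, 12, 30)},
    {"committed": datetime(2024, 5, 2, 9),  "deployed": datetime(2024, 5, 2, 9, 40),
     "failed": False},
]

def dora(records: list, window_days: int = 2) -> dict:
    """Compute the four DORA metrics — each one a loop latency or signal-quality measure."""
    failures = [r for r in records if r["failed"]]
    return {
        "deploys_per_day":     len(records) / window_days,
        "median_lead_time":    median(r["deployed"] - r["committed"] for r in records),
        "change_failure_rate": len(failures) / len(records),
        "mttr":                median(r["restored"] - r["deployed"] for r in failures),
    }
```

Note that every field in the output is either a latency (`timedelta`) or a ratio measuring signal quality, which is the point of the table above.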
Elite performers deploy at 10× the frequency of low performers
This is not because they are 10× better programmers. It is because they have 10× more opportunities to learn from real-user feedback. Each deployment is a completed feedback loop. Compressing pipeline latency is the mechanism behind DORA improvement — not deploying more for its own sake.
The feedback loop death spiral
Slow or flaky feedback → engineers ignore it → feedback quality degrades further → trust collapses → the loop is functionally broken even though it is technically running. The most common form: CI that takes 45 minutes and fails on flaky tests 20% of the time. Engineers stop reading failures carefully and start merging without green CI. The signal exists but carries no information.
4 symptoms your feedback loops are broken
What feedback loops are
📖 What the exam expects
Automated tests and CI/CD pipelines provide feedback on whether code is working correctly before it reaches production.
Feedback loops appear in disguise across system design and reliability interviews. "How would you improve your CI/CD pipeline?" = "How would you compress the CI feedback loop?" "How do you approach monitoring?" = "How do you design production feedback loops?" Naming the pattern explicitly shows architectural thinking. DORA metric questions are almost always feedback loop questions.
Strong answer: Unprompted mentions of specific DORA metric targets. Distinguishes feedback loop existence from quality and trust. Brings up monitoring independence or references the Facebook BGP incident as a feedback loop design failure. Mentions the death spiral: slow/flaky feedback → ignored → functionally broken.
Red flags: "We have unit tests" without mentioning latency, flakiness, or trust. "Our users tell us about bugs" — the production monitoring loop is absent. "We deploy monthly" — the user feedback loop runs at 1/30th the speed of a daily-deploy team.
Quick check · Feedback loops
Key takeaways
Your CI pipeline takes 52 minutes. How would you identify which stages are slowest, and what patterns (parallelism, test ordering, caching) would you apply to bring it under 10 minutes?
Your team discovers a production bug through a user complaint on social media. Walk through every feedback loop in your delivery pipeline that failed to catch this bug. Which would you fix first?
Your monitoring system is deployed inside the same AWS account and VPC as the application it monitors. What failure modes does this create, and how would you re-architect for independence?
From the books
Atomic Habits
Chapter 15: The Cardinal Rule of Behavior Change
James Clear's habit loop — cue → craving → response → reward — is a feedback loop. The reward IS the feedback signal that tells the brain whether to repeat the behavior. Clear's key insight: "The greater the distance between the behavior and the reward, the harder it is to build the habit." This is identical to the engineering insight: the greater the distance between writing code and getting feedback on it, the harder it is to learn and improve. Compressing your engineering feedback loops — from 45-minute CI to 10-minute CI, from weekly deploys to daily deploys — is compressing the habit loop. You learn faster when the reward is tightly coupled to the behavior.
💡 Analogy
The thermostat vs the "deploy and pray" pipeline
⚡ Core Idea
A thermostat has a 30-second feedback loop: measure temperature → compare to target → adjust heat → repeat. Most deployment pipelines have a 45-minute feedback loop, and no feedback loop at all for "is this feature actually useful to users?" The thermostat is smarter about learning from its environment than most engineering processes.
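The thermostat's loop fits in a few lines of code. An illustrative sketch — the temperatures, band width, and step sizes are made up, but the measure → compare → adjust cycle is the real structure:

```python
def thermostat_step(current_temp: float, target: float, heat_on: bool) -> bool:
    """One iteration of the control loop: measure, compare, adjust."""
    if current_temp < target - 0.5:   # too cold → turn heat on
        return True
    if current_temp > target + 0.5:   # too warm → turn heat off
        return False
    return heat_on                     # within the band → hold state

def simulate(steps: int = 10, temp: float = 17.0, target: float = 20.0) -> float:
    """Run the loop repeatedly; the environment responds each cycle."""
    heat_on = False
    for _ in range(steps):
        heat_on = thermostat_step(temp, target, heat_on)
        temp += 0.6 if heat_on else -0.2   # heating vs. ambient drift
    return temp
```

Ten fast iterations and the temperature converges on the target. A "deploy and pray" pipeline runs the equivalent of one iteration per release, with no measurement step at all.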
🎯 Why It Matters
Compressing feedback loops is the single highest-leverage activity in software engineering. A team that deploys multiple times per day and gets feedback in minutes is not more talented — they have more opportunities to learn and correct. Feedback loop speed IS learning speed. Every DORA metric is a feedback loop latency, and improving DORA means compressing loops — not gaming metrics.