Threat Modeling: Finding Security Flaws Before You Write Code
A structured way to find the security holes in a design before they become bugs in production. The four questions, data-flow diagrams with trust boundaries, and STRIDE, explained from zero.
Most security work happens too late. A penetration tester finds a hole two days before launch. A bug-bounty report lands six months after the feature shipped. By then the flaw is baked into the schema, the API contract, and three downstream services, and fixing it means a migration, not a one-line patch.
The cheapest security flaw to fix is the one you find on a whiteboard. Threat modeling is the discipline of looking at a design, before the code exists, and asking, systematically, "how would someone abuse this?" It is not a tool you buy or a scan you run. It is a structured conversation, and the structure is what stops it from devolving into hand-waving.
The payoff is leverage. An hour spent modeling a design can surface flaws that would otherwise cost weeks to remediate after they ship. You are not trying to find every bug; you are trying to find the *design* bugs, the ones no amount of careful coding can save you from.
Who this is for
Engineers, tech leads, and anyone who reviews designs. You do not need to be a security specialist. If you can sketch how data moves through your system, you can threat model it. This pairs naturally with [What Is Application Security?](/blog/what-is-application-security) and [The OWASP Top 10, Explained](/blog/the-owasp-top-10-explained).
The mental model: you're planning the heist, to stop it
Threat modeling is a structured way to think about what can go wrong with a system, so you can decide what to do about it before an attacker decides for you.
Picture the planning scene in every heist movie. The crew spreads the blueprints on the table. Where are the cameras? When do the guards rotate? Which door is reinforced, which one is just for show? Where does the alarm wire run? They map the building, then they walk every path an intruder could take.
Threat modeling is that same scene, except you own the vault. You are the one with the blueprints, and you are walking the attacker's paths *first* so you can put a guard on the weak door before anyone tries it. The attacker only has to find one way in. You have to find them all before they do, and the only way to be systematic about that is to map the building and check every entrance on purpose.
| Heist planning | Threat modeling |
| --- | --- |
| The building blueprints on the table | A data-flow diagram of your system |
| Walls, fences, locked doors between zones | Trust boundaries between components |
| Walking every path an intruder could take | Enumerating threats with STRIDE |
| Reinforcing the weak door, adding a camera | Mitigations: auth, validation, encryption, logging |
| Re-casing the place after a remodel | Re-modeling when the design changes |
The heist-planning frame, mapped to threat modeling.
Draw the system: a data-flow diagram with trust boundaries
You cannot reason about what can go wrong until you agree on what you are building. The standard picture for this is a data-flow diagram (DFD): the components, the data moving between them, and, critically, the trust boundaries where the level of trust changes.
A trust boundary is any line data crosses where the thing on the other side is less trusted than the thing on this side. The browser is outside your trust. The public internet is outside your trust. Even the hop from your API to your database is a boundary worth marking, because the credentials, network, and assumptions differ. Threats cluster on trust boundaries: that is where untrusted input meets trusted logic, and that is where you look first.
A simple DFD: the user is outside the boundary; each arrow that crosses a boundary is where you hunt for threats.
Each labelled arrow that crosses a boundary is a data flow, and every data flow is a candidate for abuse. With the picture agreed, you run the process, and the process is four questions.
1. What are we building?
Draw the DFD. List the components, the data flows between them, and the trust boundaries. If you cannot draw it, you do not understand it well enough to secure it. This step alone often surfaces forgotten components: that admin endpoint nobody mentioned, the third-party callback.
2. What can go wrong?
Walk each data flow and each component and ask how it could be abused. This is where STRIDE comes in: a checklist that keeps you from brainstorming randomly and missing whole categories of threat.
3. What are we going to do about it?
For each threat that matters, pick a response: mitigate (add a control), eliminate (remove the feature), transfer (push the risk elsewhere, e.g. a managed service), or accept (document it and move on). Not every threat earns a fix, but every threat earns a decision.
4. Did we do a good job?
Review the model against the built system. Were the mitigations actually implemented? Did the design change since you modeled it? A threat model that does not get revisited is a snapshot of a system that no longer exists.
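The output of the first question can also be captured as data, which makes the boundary-crossing flows easy to list. The following is a minimal sketch, not a standard tool: the class names, the zone labels, and the "Order worker" component are all hypothetical, chosen only to illustrate the idea that a flow deserves attention when its two ends sit in different trust zones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    zone: str  # trust zone label (hypothetical names, pick your own)


@dataclass(frozen=True)
class Flow:
    source: Component
    target: Component
    data: str


def boundary_crossings(flows):
    """Return the flows whose endpoints sit in different trust zones."""
    return [f for f in flows if f.source.zone != f.target.zone]


# Illustrative system: the worker is an assumed component, not from the article.
browser = Component("Browser", zone="internet")
api = Component("API", zone="app")
worker = Component("Order worker", zone="app")
db = Component("Database", zone="data")

flows = [
    Flow(browser, api, "checkout request"),
    Flow(api, worker, "order job"),
    Flow(api, db, "order record"),
]

for flow in boundary_crossings(flows):
    print(f"{flow.source.name} -> {flow.target.name}: {flow.data}")
```

Here the API-to-worker flow stays inside one zone and is filtered out, while the browser-to-API and API-to-database flows are the ones to examine first. Flows inside a zone still deserve a look; this only sets the order of attention.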
STRIDE: the checklist for "what can go wrong"
"What can go wrong" is the hard question, because an open-ended brainstorm always misses something. STRIDE is a mnemonic that turns it into a checklist: for each component and data flow, you ask the same six questions. It was developed at Microsoft and remains one of the most widely used frameworks, because it maps cleanly onto the properties you actually want a system to have.
Each letter is the *violation* of a security property: Spoofing breaks authentication, Tampering breaks integrity, Repudiation breaks accountability, Information disclosure breaks confidentiality, Denial of service breaks availability, and Elevation of privilege breaks authorization.
| Threat | What it is | Example | Mitigation |
| --- | --- | --- | --- |
| **S**poofing | Pretending to be someone or something you are not | Reusing a stolen session token to act as another user | Strong authentication, short-lived tokens, MFA |
| **T**ampering | Modifying data or code without authorization | Editing a hidden `price` field in a checkout request | Server-side validation, integrity checks, signing |
| **R**epudiation | Denying an action you actually performed | A user claims they never placed the fraudulent order | Append-only audit logs, signed receipts |
| **I**nformation disclosure | Exposing data to people who should not see it | An API returns full SSNs in a verbose error message | Returning only the fields the caller owns, scrubbed error messages, encryption |
| **D**enial of service | Making the system unavailable to legitimate users | A flood of expensive search queries exhausts the DB | Rate limiting, quotas, timeouts, autoscaling |
| **E**levation of privilege | Gaining capabilities you should not have | A normal user hits an admin endpoint and it works | Authorization checks on every action, deny by default |
STRIDE, the six threat categories, with a concrete example and a typical mitigation for each.
Pro tip
Quick mapping: external entities (users) are mainly subject to **S** and **R**. Data flows over the wire are **T**, **I**, and **D**. Processes (your services) can be hit by all six. You do not have to apply all six to everything; apply the ones that fit the element type.
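The quick mapping above is small enough to write down as a lookup table that generates the questions for each element. This is a sketch under assumptions: the element-type keys and the `questions_for` helper are hypothetical names, and element types the tip does not mention (such as data stores) are deliberately left out rather than guessed.

```python
STRIDE = {
    "S": "Spoofing",
    "T": "Tampering",
    "R": "Repudiation",
    "I": "Information disclosure",
    "D": "Denial of service",
    "E": "Elevation of privilege",
}

# Letters that apply to each element type, taken from the tip above.
APPLICABLE = {
    "external_entity": "SR",
    "data_flow": "TID",
    "process": "STRIDE",
}


def questions_for(element_name, element_type):
    """Build one prompt per applicable STRIDE category for an element."""
    return [
        f"{STRIDE[letter]} against {element_name}?"
        for letter in APPLICABLE[element_type]
    ]


for question in questions_for("checkout request", "data_flow"):
    print(question)
```

The generated prompts are only a starting point for the conversation; the answers still have to come from people who know the design.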
Worked example: STRIDE on one data flow
Theory clicks when you run it on something real. Take a single flow from the diagram above: the browser sending a checkout request to the API to place an order. One arrow, crossing the public trust boundary. We walk all six STRIDE letters against just this flow.
Spoofing: Could someone place an order as another user? If the token is long-lived or transmitted insecurely, yes. *Fix: short-lived tokens over HTTPS only, bound to the session.*
Tampering: The request carries a `total` supplied by the client. An attacker edits it to 1.00 and pays a dollar for a $149 cart. *Fix: never trust client-supplied prices; recompute the total server-side from the cart.*
Repudiation: The user later disputes the charge and claims they never ordered. *Fix: write an append-only audit event (who, what, when, source IP) the moment the order is accepted.*
Information disclosure: Does the response leak other users' addresses or full payment details? *Fix: return only the fields this user owns; scrub internal IDs and stack traces from errors.*
Denial of service: Can someone hammer checkout to exhaust inventory locks or the DB? *Fix: rate-limit per account and per IP; put timeouts on downstream calls.*
Elevation of privilege: Can a user order on behalf of an account they do not own by swapping `cartId`? *Fix: authorize that the cart belongs to the authenticated user before processing; do not trust the ID alone.*
One arrow, six questions, six concrete findings, two of which (the client-supplied total and the unchecked cartId) are real, common, design-level bugs that no linter would catch. That is the entire value of threat modeling in miniature: structure turns "seems fine" into a list of decisions.
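The two design-level fixes, recomputing the total and checking cart ownership, might look roughly like this on the server. It is a minimal sketch, not a real checkout implementation: `Cart`, `CartItem`, `PRICES`, `Forbidden`, and `place_order` are hypothetical names, and the price list stands in for whatever catalog the real system uses.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CartItem:
    sku: str
    quantity: int


@dataclass(frozen=True)
class Cart:
    cart_id: str
    owner_id: str
    items: tuple


# Hypothetical server-side price list; a real system would read the catalog.
PRICES = {"keyboard": Decimal("99.00"), "mouse": Decimal("25.00")}


class Forbidden(Exception):
    """Raised when the authenticated user does not own the cart."""


def place_order(user_id, cart, client_total=None):
    # Elevation of privilege: verify ownership before doing anything else.
    if cart.owner_id != user_id:
        raise Forbidden(f"cart {cart.cart_id} does not belong to this user")
    # Tampering: client_total is ignored; the total comes from server prices.
    total = sum(
        (PRICES[item.sku] * item.quantity for item in cart.items),
        Decimal("0"),
    )
    return {"cart_id": cart.cart_id, "user_id": user_id, "total": total}


cart = Cart("c-1", owner_id="alice",
            items=(CartItem("keyboard", 1), CartItem("mouse", 2)))
order = place_order("alice", cart, client_total=Decimal("1.00"))
print(order["total"])  # 149.00, regardless of what the client sent
```

Calling `place_order("mallory", cart)` raises `Forbidden`, because the ownership check runs against the authenticated user rather than trusting the cart ID in the request.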
Common mistakes that waste the effort
Boiling the ocean. Trying to model the entire system in one marathon session produces a sprawling diagram nobody finishes. Scope it: model one feature, one new flow, or the riskiest boundary. A small model that ships beats a perfect one that never does.
No trust boundaries. A DFD without boundaries is just a box-and-arrow drawing. The boundaries are the whole point: they tell you *where* untrusted meets trusted, which is *where* the threats live. Draw them first.
Threat-model-once. Modeling the design at kickoff and never touching it again gives you a document about a system that no longer exists. Re-model when the design changes: a new integration, a new data store, a new auth flow.
No follow-through. A list of threats with no owners, no tickets, and no verification is theatre. Every threat you decide to mitigate becomes a tracked task, and step four ("did we do a good job?") checks it actually got done.
Treating it as a security-team-only ritual. The people who designed the system understand it best. Threat modeling works best as a 60-minute conversation the *builders* run, with security in the room as a guide, not a gate they throw the design over.
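The follow-through mistake in particular can be checked mechanically if the threat list is kept as data. The sketch below is one possible shape, assuming a hypothetical `Threat` record and made-up owner and ticket values; it flags threats with no recorded decision and mitigations that lack an owner, a ticket, or verification.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Response(Enum):
    MITIGATE = "mitigate"
    ELIMINATE = "eliminate"
    TRANSFER = "transfer"
    ACCEPT = "accept"


@dataclass
class Threat:
    title: str
    response: Optional[Response] = None
    owner: Optional[str] = None
    ticket: Optional[str] = None  # hypothetical tracker ID
    verified: bool = False


def open_items(threats):
    """List threats that still need a decision, an owner, or verification."""
    problems = []
    for t in threats:
        if t.response is None:
            problems.append(f"{t.title}: no decision recorded")
        elif t.response is Response.MITIGATE:
            if not (t.owner and t.ticket):
                problems.append(f"{t.title}: mitigation has no owner or ticket")
            elif not t.verified:
                problems.append(f"{t.title}: mitigation not yet verified")
    return problems


threats = [
    Threat("Client-supplied total", Response.MITIGATE,
           owner="payments", ticket="SEC-101", verified=True),
    Threat("Unchecked cartId", Response.MITIGATE),
    Threat("Checkout flooding"),
    Threat("Verbose error messages", Response.ACCEPT),
]

for line in open_items(threats):
    print(line)
```

Accepted, eliminated, and transferred threats pass the check because a decision was recorded; only undecided threats and unfinished mitigations are reported.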
Takeaways
The whole article in seven lines
Threat modeling finds design-level security flaws before code exists, the cheapest time to fix them.
It runs on four questions: what are we building, what can go wrong, what do we do about it, did we do a good job.
Start by drawing a data-flow diagram and marking the trust boundaries; threats cluster where trust changes.
STRIDE turns "what can go wrong" into a six-item checklist: Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege.
Apply STRIDE per data flow and per component, not as one vague brainstorm.
Every threat earns a decision: mitigate, eliminate, transfer, or accept, and mitigations become tracked tasks.
Scope small, draw boundaries, re-model on change, and follow through. A small model that ships beats a perfect one that doesn't.
Where to go next
Threat modeling tells you *where* to look. The next step is knowing the concrete flaws to look for, and how to prove your mitigations actually work in production.
Close the loop on the Repudiation and DoS findings with Security Logging & Monitoring: your audit log and rate limits only help if you can see them.
See where this fits in a career with the DevOps Engineer path, where designing secure systems is a senior-level expectation.