Preventing Injection Attacks (SQLi, XSS & Friends)

On this page

A search box that ran DROP TABLE
The principle: data is not code
The picture: input crossing the boundary
The injection family at a glance
Code: the vulnerable query and the fix
Why allowlists beat blocklists
Common mistakes that get people breached
Takeaways
Where to go next

The principle: data is not code

Treat all input as hostile. Your job is to make sure data stays data, never let it cross the line and become code.
The one rule behind every injection fix

Every injection vulnerability is the same bug wearing different clothes. Somewhere, your code builds a string that is part structure (the query, the markup, the command) and part untrusted value (what the user typed). When those two are concatenated into one blob and handed to an interpreter, the interpreter has no way to tell where your intent ends and the attacker's begins. A stray quote, angle bracket, or semicolon flips the value into structure, and now the user is writing your code.

The fix is not to scrub away every dangerous character. The fix is to keep data and code in separate channels so the interpreter never has to guess. That is what parameterized queries, output encoding, and argument arrays all do, different mechanisms, one idea.

A mail-merge template: "Dear {name}, your order ships today."A query template: SELECT * FROM users WHERE name = ?

The {name} field is filled in from a spreadsheet of addresses.The ? placeholder is filled in with the user's input as a value.

Someone writes a macro into the name column, and the merge runs it.Concatenated input becomes executable SQL/HTML/shell instead of a value.

Fix: the field is always text, never instructions, no matter what's in it.Fix: the input is always bound as a value, never parsed as code.

Injection is a mail-merge gone rogue.

The picture: input crossing the boundary

Picture untrusted input arriving at a trust boundary, the edge of your system. Everything past that boundary is your code and your data store. The boundary's job is to decide *how* the input is allowed through. Take the safe path and the input is bound, escaped, or rejected; take the unsafe path, string concatenation, and the input flows straight into the interpreter as code.

Untrusted input hits a boundary. The safe path keeps data as data; the unsafe path lets it become code.

1
Input arrives
A request carries values you did not write, query params, JSON body, headers, cookies. Assume every one is attacker-controlled.
2
Hit the boundary
Before the value touches any interpreter, route it through validation and a safe-construction mechanism. This is the only place the decision is made.
3
Validate the shape
Check the value is what you expect, an integer ID, an email, one of a fixed set of sort columns. Reject anything that fails (allowlist), don't try to clean it.
4
Bind, don't build
Pass the value as a parameter to a prepared statement, or encode it for the exact output context. The interpreter receives structure and data on separate channels.
5
Execute safely
The query runs with the input as a literal value; the page renders it as visible text; the command treats it as a single argument. Data never became code.

The injection family at a glance

Three interpreters, three contexts, one bug. Notice that the fix column is the same idea each time, separate the data channel from the code channel, in the language of whatever interpreter you are talking to.

Type	Where it hits	The fix
SQL injection (SQLi)	Database queries built by concatenating input	Parameterized queries / prepared statements, bind input as values
Cross-site scripting (XSS)	HTML/JS rendered with un-encoded input in the browser	Context-aware output encoding; use textContent, not innerHTML; CSP as defense in depth
Command injection	OS commands run via a shell with input in the string	Pass args as an array to the program directly; never invoke a shell with concatenated input

Same root cause, different interpreter.

Code: the vulnerable query and the fix

Here is the classic SQLi bug. The input is glued into the query string, so a crafted value rewrites the query. The infamous ' OR '1'='1 turns a login check into "return every row," and '; DROP TABLE ...;-- runs a second statement entirely.

vulnerable.py

python

# DANGER: input is concatenated straight into SQL
def find_user(username):
    query = "SELECT * FROM users WHERE username = '" + username + "'"
    return db.execute(query)

# Attacker sends:  username = "' OR '1'='1"
# Resulting query: SELECT * FROM users WHERE username = '' OR '1'='1'
#   -> matches every row
#
# Attacker sends:  username = "'; DROP TABLE users;--"
# Resulting query: SELECT * FROM users WHERE username = ''; DROP TABLE users;--'
#   -> runs a second, destructive statement

The fix is a parameterized query. The ? (or %s, or $1, driver-dependent) is a placeholder. You hand the driver the query *and* the values as two separate arguments. The driver sends the SQL structure and the data over different channels, so the value is always treated as a literal, even if it contains quotes, semicolons, or whole SQL statements.

safe.py

python

# SAFE: query structure and values travel separately
def find_user(username):
    query = "SELECT * FROM users WHERE username = ?"
    return db.execute(query, (username,))   # username is bound as a VALUE

# Attacker sends:  username = "' OR '1'='1"
# The driver looks for a user literally named  ' OR '1'='1
#   -> zero rows. The input never becomes SQL.
#
# Rule of thumb: if you are using string concatenation, +, f-strings,
# or .format() to build SQL, you have a bug. Always parameterize.

XSS is the same mistake in the browser. If you drop user input into the page as markup, a <script> payload executes in your visitors' sessions. The fix is context-aware output encoding: encode the value for the exact place it lands. The simplest version, assign to textContent (or let a framework escape by default) so the value renders as visible text, never as HTML.

output-encoding.ts

typescript

// DANGER: input becomes live HTML
function renderComment(comment: string) {
  el.innerHTML = "<p>" + comment + "</p>";
  // comment = '<img src=x onerror="steal(document.cookie)">'
  //   -> the onerror handler runs in your user's session
}

// SAFE: input is rendered as text, not parsed as markup
function renderComment(comment: string) {
  const p = document.createElement("p");
  p.textContent = comment;   // < > & are shown literally, never executed
  el.replaceChildren(p);
}

// In React/Vue/Angular, {value} / {{ value }} encode by default.
// The danger is only when you opt out: dangerouslySetInnerHTML, v-html.
// If you MUST render rich HTML, sanitize first (e.g. DOMPurify),
// and add a Content-Security-Policy as defense in depth.

Why allowlists beat blocklists

When people first meet injection, the instinct is to block the bad stuff, strip out ', ban the word DROP, filter <script>. This is a blocklist, and it loses every time. You are trying to enumerate infinity. Attackers have endless encodings: SeLeCt, URL-encoding, Unicode homoglyphs, nested tags like <scr<script>ipt> that survive a single naive strip, comment tricks, and case games. Miss one and you are owned.

An allowlist flips the logic: define what is *valid* and reject everything else. A user ID must match ^[0-9]+$. A sort column must be one of {"name", "created_at", "price"}. A country must be a known ISO code. You no longer need to imagine every attack, anything that is not explicitly allowed is gone. Validation is about *shape and meaning*; it complements (never replaces) parameterizing and encoding.

allowlist.py

python

# Allowlist: only known-good values get through.
ALLOWED_SORT = {"name", "created_at", "price"}

def list_products(sort_by):
    if sort_by not in ALLOWED_SORT:        # reject, don't sanitize
        sort_by = "name"
    # Identifiers (column names) can't be parameterized, so an
    # allowlist is exactly the right tool here.
    query = f"SELECT * FROM products ORDER BY {sort_by}"
    return db.execute(query)

Use both layers

Parameterize/encode is your primary defense, it makes injection structurally impossible at the interpreter. Allowlist validation is the second layer, and the only option when you can't parameterize (e.g. column or table names). Defense in depth means an attacker has to beat both.

Common mistakes that get people breached

String-concatenated queries. Any +, f-string, or .format() building SQL is a latent SQLi. Parameterize every value, no exceptions, even "internal" or "admin-only" endpoints.
Blocklist filtering. Stripping ', banning keywords, or regex-matching "bad" patterns. Attackers route around blocklists with encoding and casing tricks. Allowlist the valid shape instead.
`innerHTML` with user data. Assigning untrusted input to innerHTML (or dangerouslySetInnerHTML / v-html) parses it as live markup. Use textContent, or sanitize with a vetted library if rich HTML is truly required.
Trusting client-side validation. The browser check is UX, not security, attackers call your API directly and skip it entirely. Always re-validate and parameterize on the server.
Shelling out with concatenated strings. os.system("ping " + host) is command injection. Pass arguments as an array (subprocess.run(["ping", host])) so no shell parses the input.

Takeaways

The whole article in seven lines

Injection = untrusted input getting treated as code. It is the #1 root-cause class.
One rule fixes the whole family: **never let data become code**, keep them in separate channels.
SQLi → parameterized queries. XSS → context-aware output encoding. Command injection → argument arrays, no shell.
Validate input as an allowlist (known-good shape); reject the rest. Never blocklist.
Allowlists beat blocklists because you can't enumerate every attack, only every valid value.
Client-side validation is UX; the server must re-validate and parameterize.
Layer it: parameterize/encode as the primary defense, allowlist validation as the backstop, CSP as defense in depth.

Where to go next

Injection is one slice of a bigger picture. See how it ranks and connects in The OWASP Top 10, Explained, then learn where these defenses sit at the edge of a service in Secure API Design.

Bake validation and parameterization into your service templates from day one, follow the DevOps Engineer path to see where security checks fit in the pipeline.
Practice spotting unsafe string-building in code review, treat every concatenated query, command, or markup string as a finding.
Add a Content-Security-Policy header to your apps as an XSS backstop, and wire dependency/secret scanning into CI so injection isn't your only line of defense.

Want to go deeper?

This article covers concepts taught hands-on in the Cloud Engineer and DevOps career paths, with real terminal labs, production scenarios, and structured lessons.

Explore Career Paths Try the Labs

Keep reading

Cloud

Cloud Identity & Access (IAM) From First Principles

Read

DevOps

Securing the Software Supply Chain (SLSA, SBOM, Signing)

Read

Security

Security as a Non-Functional Requirement

Read

Preventing Injection Attacks (SQLi, XSS & Friends)

A search box that ran DROP TABLE

The principle: data is not code

The picture: input crossing the boundary

The injection family at a glance

Code: the vulnerable query and the fix

Why allowlists beat blocklists

Common mistakes that get people breached

Takeaways

Where to go next

Want to go deeper?

Cloud Identity & Access (IAM) From First Principles

Securing the Software Supply Chain (SLSA, SBOM, Signing)

Security as a Non-Functional Requirement