Prompt Engineering Fundamentals

On this page

You typed a question and got mush
The one principle everything else follows from
The anatomy of a good prompt
The core techniques, and when to reach for each
A real before and after
Common mistakes that quietly wreck results
Takeaways
Where to go next

You typed a question and got mush

You ask the model to "summarize this support ticket" and it hands back five paragraphs when you wanted two bullet points. You ask for JSON and get JSON wrapped in an apology. You change one word and the whole answer falls apart. The model is clearly capable, your colleague got a perfect result from the *same* model an hour ago, so the gap isn't the model. It's the instructions.

Prompt engineering is the skill of writing those instructions so the model does what you actually meant, reliably, the tenth time as well as the first. It's less "magic words" and more "clear technical writing for a very fast, very literal reader." This article covers the techniques that survive model upgrades, the ones still true on next year's model, and is honest about the tricks that are mostly folklore.

Who this is for

Anyone who has used ChatGPT or an API and gotten inconsistent results. No ML background needed. If you've read [How LLMs Actually Work](/blog/how-llms-actually-work) you'll have the mental model for *why* these techniques work; if not, you can still apply every one of them today.

The one principle everything else follows from

A model can only act on what's in front of it. It has no memory of your intent, your codebase, or your last conversation, only the tokens you sent this turn. Prompting is the craft of putting the right context in front of it.
The whole skill in two sentences

Hold onto an analogy, because it makes every later decision obvious. Think of the model as a brilliant new contractor on their first day, sharp, fast, widely read, but with zero context about *your* situation and a habit of taking everything literally.

You hand a new contractor a one-line task with no contextA bare prompt with no role, examples, or format, output is a guess at what you meant

You explain who they are and what hat to wear todayThe system prompt: role, tone, constraints, what to never do

You show them two finished examples of the workFew-shot examples that pin down format and style by demonstration

You say "walk me through your reasoning before you decide"Chain-of-thought: ask for steps so the model computes instead of guessing

You hand them a labeled folder, not a pile of loose paperDelimiters and structure that separate instructions from data

Brief the model like you'd brief a sharp contractor who just walked in the door.

A literal contractor does exactly what you said, not what you hoped. Every technique below is just a different way of being explicit so the literal reader can't go wrong.

The anatomy of a good prompt

A strong prompt isn't one blob of text. It's a few distinct parts, each doing one job. Picture the request flowing through these parts into the model and back out as a response:

A complete prompt is assembled from parts, sent to the model, and shaped into a response.

Not every prompt needs every part, a quick one-off might be just task + output spec. But when results get flaky, the fix is almost always *one of these parts is missing or muddled*. Here's how to build one deliberately.

1
State the role and rules in the system message
"You are a support-ops assistant. Be concise. Never invent ticket IDs." The system role sets defaults that hold across the whole conversation.
2
Give the task in one clear sentence
Lead with the verb and the object: "Classify this ticket into one of: billing, bug, feature, other." Vague verbs like "handle" or "look at" invite vague output.
3
Separate the data with delimiters
Wrap the ticket text in triple quotes, XML tags, or markdown fences so the model never confuses *what to act on* with *instructions about how to act*.
4
Show the shape you want back
Describe the output exactly, "Return only the category as a lowercase word" or a JSON schema. If format matters, demonstrate it with one example.
5
Add reasoning only if the task needs it
For judgement calls, ask the model to think step by step before answering. For lookups and simple classification, skip it, it just adds latency and cost.

The core techniques, and when to reach for each

Most of prompting is choosing among four moves. They stack, role-prompting plus few-shot plus chain-of-thought is common, but each earns its place only when the task calls for it.

Technique	What it is	When to use it
Zero-shot	Just describe the task in plain language, no examples.	Simple, common tasks the model has clearly seen a million times, translate, summarize, rewrite. Start here always.
Few-shot	Include 2–5 input→output examples before the real input.	When format or style matters and is hard to describe in words. The examples teach by demonstration far better than adjectives.
Chain-of-thought	Ask the model to reason step by step before the final answer.	Multi-step reasoning, math, logic, or judgement calls. Trades latency and tokens for accuracy, not free, often worth it.
Role-prompting	Assign a persona or expertise in the system message.	To set tone, vocabulary, and default assumptions ("senior SRE", "patient tutor"). Useful for voice; do not expect it to add knowledge the model lacks.

Pick the lightest technique that does the job; add more only when output stays unreliable.

Honest about the hype

"You are a world-class expert" does not unlock hidden genius, the model isn't sandbagging until flattered. Role prompts shape *style*, not *capability*. Likewise, padding with "think very carefully" helps far less than actually structuring the prompt. Treat dramatic prompt tricks with suspicion; treat clear instructions and good examples as the real levers.

A real before and after

Here's the same job, turning a support ticket into a structured triage record, written two ways. The first is what people type when they're in a hurry. The second applies role, delimiters, an example, and an output spec.

before.txt

text

Read this ticket and tell me what's wrong and how urgent it is:

Hi, I've been charged twice for my May subscription and the
second charge pushed me over my limit. Please fix ASAP.

That works *sometimes*. But the urgency wording drifts, the category is freeform, and you can't parse the result reliably. Now the engineered version:

after.txt

text

System:
You are a support triage assistant. Classify tickets accurately
and never invent details that aren't in the text.

Classify the ticket below into JSON with exactly these fields:
  category: one of [billing, bug, feature, account, other]
  severity: one of [low, medium, high]
  summary: one sentence, under 20 words

Example
Ticket: """The export button does nothing when I click it."""
Output: {"category":"bug","severity":"medium","summary":"Export button is unresponsive on click."}

Now classify this ticket. Return only the JSON, no prose.
Ticket: """Hi, I've been charged twice for my May subscription and the
second charge pushed me over my limit. Please fix ASAP."""

The second prompt is longer, and that's the point. It fixes the role, constrains the category and severity to closed sets, demonstrates the exact JSON with one example, and isolates the ticket inside delimiters so the model can't mistake the customer's "Please fix ASAP" for an instruction. The output is now parseable every time. When you need that JSON to be *guaranteed* valid rather than just usually-valid, you graduate to Structured Output & Tool Calling.

Common mistakes that quietly wreck results

Mixing instructions and data with no separator, the model treats text inside your data as commands (this is also the seed of prompt injection).
Asking for a format without showing it. "Return a table" is ambiguous; one example removes all doubt.
Negative-only instructions. "Don't be verbose" is weaker than "Answer in at most two sentences", tell it what to do, not just what to avoid.
Cramming five tasks into one prompt. Split unrelated jobs into separate calls; accuracy drops fast when a single prompt juggles too much.
Over-engineering simple tasks. If zero-shot already works, adding examples and chain-of-thought just burns tokens and latency.
Tuning on a single example. A prompt that nails *one* input often breaks on the next, you need a test set, not a lucky screenshot.

Iterate against evals, not vibes

The single biggest upgrade to your prompting is to stop eyeballing one output and start scoring many. Collect 20–50 real inputs with known-good answers, run each prompt change against all of them, and keep what raises the score. That's how you tell a real improvement from a coincidence, covered in depth in [Evaluating LLM Applications](/blog/evaluating-llm-applications).

Takeaways

Prompt engineering in nine lines

Treat the model as a brilliant, literal contractor on day one, give it everything it needs, assume nothing.
A good prompt has parts: system role, context, examples, task, output spec. Missing results trace back to a missing part.
Start zero-shot. Add few-shot when format matters, chain-of-thought when the task needs reasoning.
Examples teach format better than any adjective, show, don't just describe.
Separate instructions from data with delimiters, always.
Tell the model what to do, not only what to avoid.
Role prompts shape style, not capability, "world-class expert" is mostly hype.
Don't over-engineer simple tasks; the lightest technique that works wins.
Improve against an eval set, not a single lucky output.

Where to go next

Prompting is the first skill on the AI Engineer track because everything downstream, tool use, retrieval, agents, is still, underneath, the craft of putting the right tokens in front of the model. Strengthen the foundation, then build on it:

How LLMs Actually Work, the mental model for *why* delimiters, examples, and reasoning prompts change the output.
Structured Output & Tool Calling, go from "usually valid JSON" to guaranteed schemas and real function calls.
Evaluating LLM Applications, build the test set that turns prompt tweaking from guesswork into engineering.
The AI Engineer path, the full track, from prompting fundamentals through production LLM systems.

Want to go deeper?

This article covers concepts taught hands-on in the Cloud Engineer and DevOps career paths, with real terminal labs, production scenarios, and structured lessons.

Explore Career Paths Try the Labs

Keep reading

AI Engineering

RAG Architecture Explained for Backend Engineers

Read

AI Engineering

What Is an AI Engineer?

Read

AI Engineering

How LLMs Actually Work (for Engineers)

Read