LargePortfolio centerpiece ~33h· 7 milestones

Ship a production agentic assistant with guardrails

The RAG bot worked, so now the business wants it to *act*, look things up, call internal tools, and complete multi-step tasks.

Agent architectureTool/function callingGuardrails & safetyObservability & tracingEval harnessCost/latency budgetingDeploymentChaos / failover testingMulti-model fallback & circuit breakersBlameless postmortems

Build this free Browse all projectsNo credit card. Already a member?

What you'll build

A multi-tool agent that plans and executes tasks, with tool-call guardrails, PII/injection defenses, full tracing, an eval suite, and latency/cost budgets, packaged and deployed.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

agent/loop.pypython

import time

MAX_STEPS, MAX_SECONDS, MAX_COST_USD = 8, 60, 0.50


def run(task: str, step_fn) -> dict:
    """step_fn(history) -> {'done': bool, 'tool': str|None, 'args': dict, 'cost': float}"""
    history, spent, t0, last = [], 0.0, time.perf_counter(), None
    for step in range(MAX_STEPS):
        if time.perf_counter() - t0 > MAX_SECONDS:
            return {"status": "timeout", "history": history}
        if spent > MAX_COST_USD:
            return {"status": "cost_capped", "history": history}
        out = step_fn(history)
        spent += out.get("cost", 0.0)
        sig = (out.get("tool"), str(out.get("args")))
        if sig == last and out.get("tool"):      # stuck calling the same thing
            return {"status": "loop_detected", "history": history}
        last = sig
        history.append(out)
        if out["done"]:
            return {"status": "ok", "history": history}
    return {"status": "max_steps", "history": history}

Reading this file

MAX_STEPS, MAX_SECONDS, MAX_COST_USDThree independent limits, because a runaway agent finds creative ways to blow past whichever one you forgot.
if time.perf_counter() - t0 > MAX_SECONDSA wall-clock timeout so a slow task cannot hang the system indefinitely.
if spent > MAX_COST_USDA hard dollar ceiling, the backstop that stops an agent from quietly running up a bill.
if sig == last and out.get("tool")Detects the agent repeating the same call and breaks out, catching loops the global caps would only catch slowly.

The safety chassis: caps on steps, wall-clock, and cost, plus repeated-call detection.

That's 1 of 11 explained code blocks in this single project.

The build, milestone by milestone

1
Design the agent loop
5 guided steps
The loop is the safety chassis. Without hard step/time/cost limits, a confused agent will happily burn your budget in an infinite tool-calling spiral.
2
Wire real tools
5 guided steps
Tools are where an agent gains power, and risk. Least-privilege scoping and per-call validation are what stand between a helpful agent and one that deletes production data.
3
Make it observable
5 guided steps
Agents fail in non-obvious, multi-step ways. Without end-to-end tracing, every bug report is "it gave a weird answer once" with no way to reproduce it.
4
Harden it
5 guided steps
An acting agent with no guardrails is a liability. Injection defense, PII handling, and a human gate on high-impact actions are what make it safe to put in front of real users.
5
Evaluate & budget
5 guided steps
Task success rate and cost-per-task are the numbers that decide whether this ships. "It works in the demo" is not an answer leadership accepts for an agent that spends money per run.
6
Deploy
5 guided steps
An agent that only runs on your laptop is a notebook, not a system. Packaging and deploying it is what makes it something a team could actually operate.
7
Break it on purpose, then write it up
5 guided steps
An agent in production will face a provider outage, a 429 storm, a hung tool, and a cost spike, not "if" but "when". Injecting those failures deliberately, while you are watching, is the only way to know your fallbacks and breakers actually fire instead of cascading into a stuck or runaway agent.

What's inside when you start

4 starter files, ready to clone

7 guided milestones

7 full reference solutions

11 code blocks explained line-by-line

7 "is it working?" checks

4 interview questions it prepares you for

You'll walk away with

A deployed agent service with traced runs

An eval suite scoring task success and safety

A design doc covering guardrails and the cost/latency budget

A chaos/failover drill report covering provider, tool, and cost-spike failures with time-to-detect/recover

A blameless postmortem (timeline, root cause, action items) of the worst drill

This is portfolio-grade. Build it free.

Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.

Start building

Ship a production agentic assistant with guardrails

What you'll build

See how we teach, before you sign up

The build, milestone by milestone

Design the agent loop

Wire real tools

Make it observable

Harden it

Evaluate & budget

Deploy

Break it on purpose, then write it up

What's inside when you start

You'll walk away with

This is portfolio-grade. Build it free.