Build a governed multi-agent system
The frontier of the field: multiple specialized agents collaborating on a real workflow, safely enough to deploy and auditable enough to trust.
What you'll build
A multi-agent system with orchestration (planner + workers), human-in-the-loop gates, guardrails, full observability, an eval suite, and an audit log, deployed.
See how we teach, before you sign up
You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:
from enum import Enum

from pydantic import BaseModel, Field


class Role(str, Enum):
    PLANNER = "planner"
    WORKER = "worker"
    CRITIC = "critic"


class SubTask(BaseModel):
    id: str
    goal: str
    assigned_to: str  # worker role/name
    allowed_tools: list[str] = Field(default_factory=list)
    requires_human: bool = False  # gated high-impact action


class WorkerResult(BaseModel):
    subtask_id: str
    output: str
    tool_calls: list[str] = Field(default_factory=list)
    error: str | None = None

Reading this file
- class Role(str, Enum): Names the fixed set of agent roles so responsibilities cannot blur into each other.
- class SubTask(BaseModel): The typed unit of work agents pass around, so every dispatch is inspectable, not free text.
- allowed_tools: list[str]: Scopes each sub-task to specific tools, enforcing least privilege from the start.
- requires_human: bool = False: Flags high-impact sub-tasks that must pause for human approval before running.
- class WorkerResult(...): The typed result a worker returns, including an error field so failures are explicit, not silent.
The typed message contract: agents pass these models, never free-form text, so the system stays inspectable.
That's 1 of 10 explained code blocks in this single project.
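To make the least-privilege idea concrete, here is a minimal sketch of how the contract above could be checked before a worker touches a tool. It is an illustration, not the project's reference solution: the models are re-declared as stdlib dataclasses (stand-ins for the pydantic versions, so the sketch runs without dependencies), and `run_tools` and the tool names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Stand-ins for the pydantic models above. Field names match the original
# file; dataclasses are an assumption made to keep the sketch dependency-free.
@dataclass
class SubTask:
    id: str
    goal: str
    assigned_to: str
    allowed_tools: list[str] = field(default_factory=list)
    requires_human: bool = False


@dataclass
class WorkerResult:
    subtask_id: str
    output: str
    tool_calls: list[str] = field(default_factory=list)
    error: str | None = None


def run_tools(task: SubTask, requested: list[str]) -> WorkerResult:
    """Hypothetical helper: refuse any tool call outside the sub-task's scope."""
    denied = [name for name in requested if name not in task.allowed_tools]
    if denied:
        # Report the failure through the error field instead of silently
        # dropping the call.
        return WorkerResult(
            subtask_id=task.id,
            output="",
            error=f"tools not allowed: {', '.join(denied)}",
        )
    # Real tool execution is out of scope here; only record the calls.
    return WorkerResult(subtask_id=task.id, output="ok", tool_calls=list(requested))


task = SubTask(
    id="t1",
    goal="summarize the ticket",
    assigned_to="worker",
    allowed_tools=["search"],
)
ok = run_tools(task, ["search"])
blocked = run_tools(task, ["search", "send_email"])
```

Here `ok` carries the recorded tool call and no error, while `blocked` comes back with an explicit error and no tool calls, which is the "explicit, not silent" failure the contract is designed for.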
The build, milestone by milestone
- 1
Design the topology
5 guided steps
Multi-agent systems fail at the seams: unclear roles and authority cause agents to duplicate work, contradict each other, or take actions no one authorized. The design is the safety boundary.
- 2
Orchestrate the work
5 guided steps
The orchestrator is where intent becomes coordinated action. Bounded autonomy, workers that can’t exceed their mandate, is what makes the whole system controllable.
- 3
Gate risky actions
5 guided steps
Autonomy is fine until an action is irreversible or costly. Human-in-the-loop (HITL) approval on the dangerous few and guardrails on the routine many: that’s what makes a system deployable rather than a liability.
- 4
Observe & audit
5 guided steps
When a multi-agent run does something unexpected, you need to answer “which agent decided what, and why?” An append-only audit log is also what makes the system defensible to compliance.
- 5
Break it on purpose
5 guided steps
A governed system that has never been broken on purpose is untested. You learn far more from deliberately killing a provider or spiking the spend than from a hundred happy-path runs, and the postmortem is how the lesson sticks.
- 6
Evaluate & deploy
5 guided steps
A multi-agent system that isn’t scored end-to-end is a science project. Evals plus a real deployment are what turn it into something you can defend in a review and run for users.
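As a rough illustration of how milestones 3 and 4 fit together, the sketch below pairs a human-approval gate with an append-only audit log. Everything in it is an assumption made for illustration: the `AuditLog` class, the hash-chaining scheme, the `dispatch` function, and the `approve` callback are hypothetical and not taken from the project's solutions.

```python
from __future__ import annotations

import hashlib
import json
from collections.abc import Callable
from dataclasses import dataclass, field

_FIELDS = ("agent", "action", "reason", "prev")


def _digest(body: dict) -> str:
    # Hash a canonical JSON form so the same entry always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only log; each entry stores the hash of the one before it."""

    _entries: list[dict] = field(default_factory=list)

    def append(self, agent: str, action: str, reason: str) -> None:
        prev = self._entries[-1]["hash"] if self._entries else ""
        body = {"agent": agent, "action": action, "reason": reason, "prev": prev}
        self._entries.append({**body, "hash": _digest(body)})

    def entries(self) -> tuple[dict, ...]:
        # Hand out copies so callers cannot edit history through this view.
        return tuple(dict(e) for e in self._entries)

    def verify(self) -> bool:
        """Return False if any entry was altered or the chain was broken."""
        prev = ""
        for entry in self._entries:
            body = {key: entry[key] for key in _FIELDS}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True


def dispatch(
    subtask_id: str,
    requires_human: bool,
    approve: Callable[[str], bool],
    log: AuditLog,
) -> str:
    """Run a sub-task, pausing for approval when it is flagged as high impact."""
    if requires_human:
        if not approve(subtask_id):
            log.append("orchestrator", f"blocked:{subtask_id}", "human approval denied")
            return "blocked"
        log.append("human", f"approved:{subtask_id}", "gated action approved")
    log.append("worker", f"ran:{subtask_id}", "within mandate")
    return "ran"


log = AuditLog()
routine = dispatch("t1", False, lambda _id: False, log)  # no gate, runs
denied = dispatch("t2", True, lambda _id: False, log)    # gated, reviewer says no
granted = dispatch("t3", True, lambda _id: True, log)    # gated, reviewer says yes
```

Each decision leaves a record of which agent acted and why, and `log.verify()` lets you detect after-the-fact edits. A production system would also need durable storage and real reviewer identities, which this sketch leaves out.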
What's inside when you start
You'll walk away with
This is portfolio-grade. Build it free.
Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.
Start building