Large · Portfolio centerpiece · ~33h · 6 milestones

Build a governed multi-agent system

The frontier of the field: multiple specialized agents collaborating on a real workflow, safely enough to deploy and auditable enough to trust.

Multi-agent orchestration · Planner/worker patterns · Human-in-the-loop · Guardrails & governance · Observability · Evaluation · Deployment · Chaos & failover engineering · Blameless postmortems

What you'll build

A deployed multi-agent system with orchestration (planner + workers), human-in-the-loop gates, guardrails, full observability, an eval suite, and an audit log.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

contracts.py (Python)
from enum import Enum
from pydantic import BaseModel, Field


class Role(str, Enum):
    PLANNER = "planner"
    WORKER = "worker"
    CRITIC = "critic"


class SubTask(BaseModel):
    id: str
    goal: str
    assigned_to: str                      # worker role/name
    allowed_tools: list[str] = Field(default_factory=list)
    requires_human: bool = False          # gated high-impact action


class WorkerResult(BaseModel):
    subtask_id: str
    output: str
    tool_calls: list[str] = Field(default_factory=list)
    error: str | None = None

Reading this file

  • class Role(str, Enum): Names the fixed set of agent roles so responsibilities cannot blur into each other.
  • class SubTask(BaseModel): The typed unit of work agents pass around, so every dispatch is inspectable, not free text.
  • allowed_tools: list[str]: Scopes each sub-task to specific tools, enforcing least privilege from the start.
  • requires_human: bool = False: Flags high-impact sub-tasks that must pause for human approval before running.
  • class WorkerResult(...): The typed result a worker returns, including an error field so failures are explicit, not silent.

This is the typed message contract: agents pass these objects, never free-form text, so the system stays inspectable.
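To make the least-privilege idea concrete, here is a minimal sketch of how an orchestrator might check a worker's result against the sub-task's allow-list. It is not taken from the project: `check_result` is a hypothetical helper, and the models are stdlib dataclass stand-ins for the pydantic classes in contracts.py so the example runs without dependencies.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Stand-ins for the contracts.py models (assumption: dataclasses, not pydantic).
@dataclass
class SubTask:
    id: str
    goal: str
    assigned_to: str
    allowed_tools: list[str] = field(default_factory=list)
    requires_human: bool = False


@dataclass
class WorkerResult:
    subtask_id: str
    output: str
    tool_calls: list[str] = field(default_factory=list)
    error: str | None = None


def check_result(task: SubTask, result: WorkerResult) -> WorkerResult:
    """Hypothetical check: reject results that used tools outside the allow-list."""
    illegal = [t for t in result.tool_calls if t not in task.allowed_tools]
    if illegal:
        # Discard the output and make the violation explicit in the error field.
        return WorkerResult(
            subtask_id=task.id,
            output="",
            tool_calls=result.tool_calls,
            error=f"tools outside mandate: {illegal}",
        )
    return result


task = SubTask(id="t1", goal="Summarize open tickets", assigned_to="worker",
               allowed_tools=["search_tickets"])
ok = check_result(task, WorkerResult("t1", "3 open tickets", ["search_tickets"]))
bad = check_result(task, WorkerResult("t1", "refund issued", ["issue_refund"]))
print(ok.error)   # None
print(bad.error)  # tools outside mandate: ['issue_refund']
```

Because both the request and the result are typed, the check is a plain comparison of two lists rather than a parse of free-form text.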

That's 1 of 10 explained code blocks in this single project.

The build, milestone by milestone

  1. Design the topology

    5 guided steps

    Multi-agent systems fail at the seams: unclear roles and authority cause agents to duplicate work, contradict each other, or take actions no one authorized. The design is the safety boundary.

  2. Orchestrate the work

    5 guided steps

    The orchestrator is where intent becomes coordinated action. Bounded autonomy (workers that can’t exceed their mandate) is what makes the whole system controllable.

  3. Gate risky actions

    5 guided steps

    Autonomy is fine until an action is irreversible or costly. Human-in-the-loop (HITL) approval on the dangerous few and guardrails on the routine many are what make a system deployable rather than a liability.

  4. Observe & audit

    5 guided steps

    When a multi-agent run does something unexpected, you need to answer “which agent decided what, and why?” An append-only audit log is also what makes the system defensible to compliance.

  5. Break it on purpose

    5 guided steps

    A governed system that has never been broken on purpose is untested. You learn far more from deliberately killing a provider or spiking the spend than from a hundred happy-path runs, and the postmortem is how the lesson sticks.

  6. Evaluate & deploy

    5 guided steps

    A multi-agent system that isn’t scored end-to-end is a science project. Evals plus a real deployment are what turn it into something you can defend in a review and run for users.
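The gating and audit milestones above fit together naturally, and the following sketch shows one way they might. It is an illustration under assumptions, not the project's reference solution: `AuditLog`, `run_gated`, and the `approve` callback are hypothetical names, and hash-chaining each entry to the previous one is just one common way to make an append-only log tamper-evident.

```python
import hashlib
import json
from typing import Callable

_FIELDS = ("agent", "action", "reason", "prev")


def _digest(body: dict) -> str:
    """Stable SHA-256 over an entry's fields."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    """Hypothetical append-only log; each entry stores the previous entry's hash."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, agent: str, action: str, reason: str) -> None:
        prev = self._entries[-1]["hash"] if self._entries else "genesis"
        body = {"agent": agent, "action": action, "reason": reason, "prev": prev}
        self._entries.append({**body, "hash": _digest(body)})

    def entries(self) -> list[dict]:
        # Return copies so callers cannot edit history through the result.
        return [dict(e) for e in self._entries]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = "genesis"
        for e in self._entries:
            body = {k: e[k] for k in _FIELDS}
            if e["prev"] != prev or e["hash"] != _digest(body):
                return False
            prev = e["hash"]
        return True


def run_gated(action: str, requires_human: bool,
              approve: Callable[[str], bool], log: AuditLog) -> bool:
    """Proceed only if the action is ungated or a human approves; log either way."""
    if requires_human and not approve(action):
        log.append("orchestrator", action, "blocked: human denied approval")
        return False
    reason = "human approved" if requires_human else "routine, no gate required"
    log.append("orchestrator", action, reason)
    return True


def deny_all(action: str) -> bool:
    """Stand-in for a real approval UI: the reviewer rejects everything."""
    return False


log = AuditLog()
ran_routine = run_gated("summarize_tickets", False, deny_all, log)
ran_risky = run_gated("issue_refund", True, deny_all, log)
print(ran_routine, ran_risky, log.verify())  # True False True
```

Note that the blocked action is logged too: answering “which agent decided what, and why?” requires a record of refusals as well as of actions taken.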

What's inside when you start

4 starter files, ready to clone
6 guided milestones
6 full reference solutions
10 code blocks explained line-by-line
6 "is it working?" checks
4 interview questions it prepares you for

You'll walk away with

A deployed multi-agent system with traced, audited runs
An eval suite scoring end-to-end success and safety compliance
A governance/design doc covering roles, authority, guardrails, and human-gated actions
An audit log demonstrating full reconstruction of a consequential run
A chaos experiment report (hypotheses, outcomes, failover + cost-budget results)
A blameless postmortem with a timeline, root cause, and owned follow-up actions
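For a sense of what the chaos experiment behind that report could look like, here is a small sketch that injects a provider outage and enforces a spend cap. Everything in it is assumed for illustration: the `Router` class, the provider stubs, the per-call costs in cents, and the rule that failed calls are not billed are not taken from the project.

```python
from __future__ import annotations

from typing import Callable

Provider = Callable[[str], str]


class ProviderDown(Exception):
    """Raised by a provider stub to simulate an outage."""


class NoProviderAvailable(Exception):
    """Raised when every provider is down or would exceed the budget."""


class Router:
    """Hypothetical router: tries providers in order, within a spend cap."""

    def __init__(self, providers: list[tuple[str, Provider, int]],
                 budget_cents: int) -> None:
        self.providers = providers        # (name, callable, cost per call in cents)
        self.budget_cents = budget_cents
        self.spent_cents = 0
        self.events: list[str] = []       # timeline for the experiment report

    def call(self, prompt: str) -> str:
        for name, provider, cost in self.providers:
            if self.spent_cents + cost > self.budget_cents:
                self.events.append(f"{name}: skipped, over budget")
                continue
            try:
                reply = provider(prompt)
            except ProviderDown:
                # Assumption: a failed call is not billed.
                self.events.append(f"{name}: down, failing over")
                continue
            self.spent_cents += cost
            self.events.append(f"{name}: ok")
            return reply
        raise NoProviderAvailable(prompt)


def dead(prompt: str) -> str:
    raise ProviderDown("injected outage")


def fallback(prompt: str) -> str:
    return f"fallback answer to: {prompt}"


# Hypothesis: with the primary down, runs complete on the fallback and
# the router halts rather than overspending.
router = Router([("primary", dead, 5), ("fallback", fallback, 2)], budget_cents=5)
first = router.call("plan the migration")
second = router.call("plan the rollback")
try:
    router.call("one more request")
    third = "completed"
except NoProviderAvailable:
    third = "halted"

print(first)                      # fallback answer to: plan the migration
print(router.spent_cents, third)  # 4 halted
```

The `events` list doubles as the raw timeline for the postmortem: it records each failover and each budget refusal in the order they happened.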

This is portfolio-grade. Build it free.

Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.
