Medium · Working system · ~17h · 6 milestones

Build and evaluate a RAG assistant over real docs

A support team is drowning in repetitive questions whose answers already sit in a 400-page docs site.

RAG architecture · Embeddings & vector search · Prompt design · PII detection · LLM evaluation · Containerization · Cost & latency observability · Runbook / incident response

What you'll build

A retrieval-augmented assistant with a vector index over real documents, grounded answers with citations, PII redaction, and an evaluation harness that scores answer quality.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

docker-compose.yml (YAML)

services:
  db:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_PASSWORD: rag
      POSTGRES_DB: rag
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:

Reading this file

`image: pgvector/pgvector:pg16`: A Postgres image that ships with the pgvector extension, so similarity search is available as soon as you enable it in your database with `CREATE EXTENSION vector`.
`POSTGRES_PASSWORD: rag`: Sets the database password in plain config. That is fine for local dev, but never hardcode real secrets like this in production.
`ports: - "5432:5432"`: Exposes the database port to your host so your code can connect to it locally.
`volumes: - pgdata:...`: Persists the data to a named volume so your index survives container restarts.

A one-command vector store. Reviewers run `docker compose up -d` and the index has somewhere to live.
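To make the "somewhere to live" concrete, here is a minimal sketch of what client code might send to this container. The connection string mirrors the compose file above and assumes the image's default `postgres` superuser. The `chunks` table, its 3-dimensional `embedding` column, and the `to_vector_literal` helper are hypothetical names chosen for illustration, not part of the project's starter files.

```python
# Connection values mirror docker-compose.yml; "postgres" is assumed to be
# the default superuser because the compose file does not set POSTGRES_USER.
DSN = "postgresql://postgres:rag@localhost:5432/rag"

# Hypothetical schema: a tiny 3-dimensional embedding keeps the example readable.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector(3)
);
"""

# `<=>` is pgvector's cosine-distance operator; smaller means more similar.
SEARCH_SQL = (
    "SELECT id, content FROM chunks "
    "ORDER BY embedding <=> %s::vector LIMIT %s;"
)


def to_vector_literal(values):
    """Format numbers as the text form pgvector accepts, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


query_param = to_vector_literal([1, 2.5, 0])
```

With a Postgres driver of your choice you would run `SETUP_SQL` once, then execute `SEARCH_SQL` with `query_param` and a row limit as parameters.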

That's 1 of 10 explained code blocks in this single project.

The build, milestone by milestone

1. Ingest & chunk
5 guided steps
Retrieval can only return what you indexed well. Bad chunking (splitting mid-sentence, losing headings) silently caps the ceiling of every answer downstream.
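As a rough illustration of the two failure modes named above, the sketch below splits only on sentence boundaries and attaches the most recent heading to every chunk. It assumes markdown-style `#` headings and blank-line-separated paragraphs; `chunk_document` and `max_chars` are hypothetical names, and a real pipeline would likely count tokens rather than characters.

```python
import re


def chunk_document(text, max_chars=200):
    """Split text into chunks that end on sentence boundaries and keep their heading."""
    chunks = []
    heading = ""
    for block in text.strip().split("\n\n"):
        block = block.strip()
        if not block:
            continue
        # Assumption: headings are markdown-style lines starting with '#'.
        if block.startswith("#"):
            heading = block.lstrip("#").strip()
            continue
        # Split after sentence-ending punctuation followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", block)
        current = ""
        for sentence in sentences:
            candidate = (current + " " + sentence).strip()
            if current and len(candidate) > max_chars:
                # Adding this sentence would overflow, so close the chunk first.
                chunks.append({"heading": heading, "text": current})
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append({"heading": heading, "text": current})
    return chunks


doc = """# Resetting a password

Open Settings. Choose Security. Click Reset password.

# Billing

Invoices are emailed monthly."""

chunks = chunk_document(doc, max_chars=30)
```

A single sentence longer than `max_chars` is kept whole here rather than cut, which trades a strict size limit for never splitting mid-sentence.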
2. Embed & index
5 guided steps
This is the search engine under the assistant. If retrieval returns the wrong chunks, no amount of prompt cleverness saves the answer.
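The core of that search engine can be sketched in a few lines: rank stored vectors by cosine similarity to the query vector and keep the top k. The vectors below are made-up 3-dimensional toy values standing in for real embeddings, and `top_k` and the chunk ids are hypothetical names for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id) for chunk_id, vec in index.items()]
    scored.sort(reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy index: chunk id -> made-up embedding.
index = {
    "reset-password": [0.9, 0.1, 0.0],
    "billing": [0.0, 0.2, 0.9],
    "two-factor": [0.7, 0.6, 0.1],
}

query = [1.0, 0.2, 0.0]
results = top_k(query, index, k=2)
```

A vector database does the same ranking, but with an index structure that avoids comparing the query against every stored vector.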
3. Ground the generation
5 guided steps
Grounding plus citations is what makes the assistant trustworthy. An answer you cannot trace to a source is just a confident guess wearing a uniform.
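One way to make answers traceable, sketched under assumptions: number the retrieved chunks in the prompt, ask the model to cite them as `[n]`, then reject any answer whose citations do not map back to a real chunk. The prompt wording, the `source` and `text` keys, and both function names are hypothetical; no model call is made here.

```python
import re


def build_grounded_prompt(question, chunks):
    """Build a prompt that numbers each source so the model can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the sources below. Cite each claim as [n]. "
        "If the sources do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )


def cited_sources(answer, chunks):
    """Return the source names an answer cites; raise if it cites nothing or a missing source."""
    numbers = [int(n) for n in re.findall(r"\[(\d+)\]", answer)]
    if not numbers:
        raise ValueError("answer has no citations")
    for n in numbers:
        if not 1 <= n <= len(chunks):
            raise ValueError(f"citation [{n}] does not match any source")
    return sorted({chunks[n - 1]["source"] for n in numbers})


chunks = [
    {"source": "docs/security.md", "text": "Open Settings and choose Security to reset a password."},
    {"source": "docs/billing.md", "text": "Invoices are emailed monthly."},
]
prompt = build_grounded_prompt("How do I reset my password?", chunks)
```

This check only proves that a citation points at a real chunk. Whether the chunk actually supports the claim still needs an evaluation step.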
4. Add safety
5 guided steps
A RAG system ingests untrusted documents and handles user data, and both are attack surfaces. PII handling and injection defense are table stakes before this touches real users.
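For the PII half of that statement, a minimal sketch is a regex pass that replaces matches with a label before text is logged or sent to a model. The two patterns below are deliberately simple assumptions that cover only emails and phone-like numbers. They will miss many real-world formats, and injection defense is a separate problem not shown here.

```python
import re

# Illustrative patterns only; production PII detection needs far broader coverage.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    # A digit, at least seven digits/separators, then a closing digit.
    ("PHONE", re.compile(r"\+?\d[\d\s().-]{7,}\d")),
]


def redact_pii(text):
    """Replace each PII match with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text


message = "Contact jane.doe@example.com or call +1 (555) 010-2345 about invoice 42."
redacted = redact_pii(message)
```

Short numbers such as the invoice number stay untouched because the phone pattern requires at least nine characters.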
5. Evaluate honestly
5 guided steps
Without evals you cannot tell whether a "fix" helped or hurt. A scored eval set turns RAG tuning from guesswork into engineering.
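A scored eval set can start very small. The sketch below measures one retrieval metric, the fraction of questions whose expected source shows up in the top k results, and compares two stand-in retrievers. The questions, file names, and both retriever functions are invented for illustration. Scoring answer quality itself would need an additional grader.

```python
def hit_rate(eval_set, retrieve, k=3):
    """Fraction of questions whose expected source appears in the top-k results."""
    hits = 0
    for example in eval_set:
        retrieved = retrieve(example["question"])[:k]
        if example["expected_source"] in retrieved:
            hits += 1
    return hits / len(eval_set)


# Hypothetical held-out questions with the document that should answer each one.
eval_set = [
    {"question": "How do I reset my password?", "expected_source": "security.md"},
    {"question": "Where are my invoices?", "expected_source": "billing.md"},
    {"question": "How do I enable 2FA?", "expected_source": "security.md"},
    {"question": "Can I export my data?", "expected_source": "export.md"},
]


def baseline_retrieve(question):
    """Stand-in for a weak retriever that always returns the same documents."""
    return ["security.md", "faq.md"]


def tuned_retrieve(question):
    """Stand-in for a retriever after a fix that improves billing questions."""
    if "invoice" in question.lower():
        return ["billing.md", "faq.md"]
    return ["security.md", "faq.md"]


before = hit_rate(eval_set, baseline_retrieve)
after = hit_rate(eval_set, tuned_retrieve)
```

Running the same eval set before and after a change is what turns "it feels better" into a number you can defend.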
6. Instrument cost, latency & failure
5 guided steps
A RAG service spends money and adds latency on every query, and it fails in operator-hostile ways (empty retrieval, provider 429s, a re-index that tanks recall). Observability plus a runbook are what turn a 2am "it's slow and expensive" page into a 5-minute fix.
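The three numbers an on-call engineer usually wants first can be computed from plain query logs. This sketch uses the nearest-rank method for p95 and a simple per-token price model. The prices, token counts, latencies, and the 2000 ms budget are all made-up values for illustration, not real provider rates.

```python
def p95(values):
    """Nearest-rank 95th percentile, using integer math to avoid float rounding."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def query_cost(prompt_tokens, completion_tokens, price_in_per_1k, price_out_per_1k):
    """Cost of one query given token counts and hypothetical per-1k-token prices."""
    return (prompt_tokens / 1000) * price_in_per_1k + (completion_tokens / 1000) * price_out_per_1k


def cache_hit_rate(hit_flags):
    """Fraction of queries served from cache."""
    return sum(1 for hit in hit_flags if hit) / len(hit_flags)


# Made-up log data: 20 latencies from 100 ms to 2000 ms.
latencies_ms = list(range(100, 2100, 100))
latency_p95 = p95(latencies_ms)

# Made-up prices: 0.001 per 1k input tokens, 0.002 per 1k output tokens.
cost = query_cost(prompt_tokens=1500, completion_tokens=200,
                  price_in_per_1k=0.001, price_out_per_1k=0.002)

hit_rate_value = cache_hit_rate([True, False, False, True])

# Assumed latency budget for the example.
P95_BUDGET_MS = 2000
within_budget = latency_p95 <= P95_BUDGET_MS
```

Stating the budget next to the measurement is the useful part: an alert can fire when `within_budget` flips, instead of waiting for someone to notice the service feels slow.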

What's inside when you start

4 starter files, ready to clone

6 guided milestones

6 full reference solutions

10 code blocks explained line-by-line

6 "is it working?" checks

4 interview questions it prepares you for

You'll walk away with

A running RAG service with a documented API

An eval report with before/after scores on held-out questions

A note on chunking, retrieval, and safety trade-offs

A cost/latency report (per-query cost, p95 latency, cache hit rate) with stated budgets

A one-page operational runbook for retrieval/model/cost failures

This is portfolio-grade. Build it free.

Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.
