Run a resilient multi-service app with progressive delivery + chaos
The full cloud-native picture: multiple services, packaged with Helm and delivered via GitOps so the cluster always matches Git, released progressively through the mesh with automated promotion or rollback on real metrics, observable end-to-end, and proven resilient because you deliberately broke it and watched it degrade gracefully instead of falling over.
What you'll build
A Helm-packaged, multi-service application on Kubernetes delivered via GitOps, with mesh-based canary releases that auto-rollback on bad metrics, strict mTLS, full observability (Kiali + Grafana), autoscaling, and a documented chaos experiment proving the system degrades gracefully under pod kills and injected latency.
See how we teach, before you sign up
You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:
apiVersion: v2
name: platform
description: Multi-service app (frontend, API, worker)
type: application
version: 0.1.0
appVersion: "1.0.0"
Reading this file
apiVersion: v2
Marks this as a Helm 3 chart, the current chart format you should be using.
type: application
Says this chart deploys a runnable app, as opposed to a library chart meant only for reuse.
version: 0.1.0
The chart's own version. Bump it on every change so GitOps and Helm see a new release.
appVersion: "1.0.0"
Tracks the version of the app being deployed, separate from the chart version itself.
The chart identity. Bump version on every change so GitOps sees a new release.
That's 1 of 10 explained code blocks in this single project.
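The "bump version on every change" rule can be illustrated with a short Python sketch. It assumes the chart version is plain MAJOR.MINOR.PATCH with no pre-release or build suffix. The helpers `bump_patch` and `bump_chart_version` are hypothetical names invented for this example and are not part of Helm or of the project's own code.

```python
import re

# Assumption: plain MAJOR.MINOR.PATCH only, no pre-release or build metadata.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def bump_patch(version: str) -> str:
    """Return the version string with its PATCH component incremented."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a plain MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return f"{major}.{minor}.{patch + 1}"


def bump_chart_version(chart_yaml: str) -> str:
    """Bump only the top-level `version:` line of Chart.yaml text.

    `appVersion:` is left alone because it tracks the app, not the chart.
    """
    lines = []
    for line in chart_yaml.splitlines():
        # "appVersion:" does not start with "version:", so it is skipped.
        if line.startswith("version:"):
            current = line.split(":", 1)[1].strip().strip("\"'")
            line = f"version: {bump_patch(current)}"
        lines.append(line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    chart = (
        "apiVersion: v2\n"
        "name: platform\n"
        "type: application\n"
        "version: 0.1.0\n"
        'appVersion: "1.0.0"\n'
    )
    print(bump_chart_version(chart))
```

Committing the result changes the chart version in Git, which is what gives a GitOps controller a new release to reconcile.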
The build, milestone by milestone
- 1
Package the app with Helm
5 guided steps
Hand-maintained YAML per service per environment doesn’t scale and drifts. A Helm chart makes the whole app one versioned, parameterized unit: the artifact GitOps will deploy.
- 2
Deliver via GitOps
5 guided steps
GitOps makes deployments auditable, reproducible, and self-correcting: drift is reverted automatically, rollback is a git revert, and every change has a reviewer and a history.
- 3
Release progressively with metric-gated promotion
5 guided steps
Manual canary watching doesn’t scale, and humans miss tail regressions. Tying promotion to live success-rate and latency metrics makes releases safe by default: bad versions roll themselves back.
- 4
Observe end-to-end
5 guided steps
A multi-service system fails in ways no single service’s logs explain. End-to-end observability (golden signals plus distributed traces) is how you find which hop in a chain is actually slow or failing.
- 5
Model the cost and load-test to failure
5 guided steps
A multi-service platform with a mesh, GitOps controller, and observability stack has real and often surprising running costs, and you don’t know its capacity until you push it past the point where it falls over.
- 6
Break it on purpose
6 guided steps
Resilience you haven’t tested is a hope, not a property. A chaos experiment turns “it should survive a pod dying” into evidence and surfaces the gaps (missing retries, no PDB, a single point of failure) before users do. A blameless postmortem then turns each gap into a tracked fix instead of a one-off scare.
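The metric-gated promotion described in milestone 3 can be sketched as a small decision loop. This is an illustrative model only, not the behavior of any specific progressive-delivery tool. The names `CanaryMetrics`, `Thresholds`, and `run_canary`, and the default values (99% success rate, 500 ms p99 latency, 10% weight steps, a 50% ceiling, two failed checks before rollback), are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class CanaryMetrics:
    """One analysis interval's metrics for the canary version."""
    success_rate: float    # fraction of successful requests, 0.0 to 1.0
    p99_latency_ms: float  # 99th percentile request latency


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical gate values; tune them for the real service."""
    min_success_rate: float = 0.99
    max_p99_latency_ms: float = 500.0


def check_passes(metrics: CanaryMetrics, limits: Thresholds) -> bool:
    """A check passes only if both success rate and latency are in bounds."""
    return (
        metrics.success_rate >= limits.min_success_rate
        and metrics.p99_latency_ms <= limits.max_p99_latency_ms
    )


def run_canary(
    samples: Iterable[CanaryMetrics],
    limits: Thresholds,
    step_weight: int = 10,
    max_weight: int = 50,
    max_failed_checks: int = 2,
) -> Tuple[str, int]:
    """Return (outcome, canary traffic weight in percent).

    Each passing check shifts more traffic to the canary. Reaching
    max_weight promotes it to 100%. Too many failed checks send all
    traffic back to the stable version.
    """
    weight = 0
    failed = 0
    for metrics in samples:
        if check_passes(metrics, limits):
            weight += step_weight
            if weight >= max_weight:
                return "promoted", 100
        else:
            failed += 1
            if failed >= max_failed_checks:
                return "rolled_back", 0
    # Ran out of samples before reaching a decision.
    return "in_progress", weight


if __name__ == "__main__":
    good = CanaryMetrics(success_rate=0.999, p99_latency_ms=120.0)
    bad = CanaryMetrics(success_rate=0.90, p99_latency_ms=900.0)
    print(run_canary([good] * 5, Thresholds()))
    print(run_canary([good, bad, good, bad], Thresholds()))
```

In this model, no person has to watch a dashboard for a bad version to roll back: the decision depends only on the measured success rate and tail latency.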
This is portfolio-grade. Build it free.
Sign up to unlock every milestone step-by-step, the code skeletons, full reference solutions, and checkable tasks, with your progress saved as you build.