Ship with progressive delivery: canary and feature flags
Continues from the last build: Promotion to prod is gated and a clean artifact lands every time, but each prod deploy is still all-or-nothing: the new image flips to 100 percent of traffic the instant it rolls, so a bad release reaches every user before anyone reads the first error.
Last rung, you earned a gated path to prod: one immutable image, promoted by SHA, approved by a human.
What you'll build
You will turn that all-or-nothing prod deploy into a progressive delivery system. The api service ships as an Argo Rollout that canaries real traffic at weights of 10, 50, and 100 percent over the platform mesh. A Prometheus AnalysisTemplate, scoped to the canary pods by pod-template-hash, runs as a background analysis that begins at the 10 percent step, so a poisoned canary auto-aborts while it touches at most about 10 percent of users, and a healthy one auto-promotes through every step with no human involved. The worker's risky code path hides behind an OpenFeature flag so you can release it to a fraction of jobs, or kill it, without a redeploy. You will drive promotion and abort from your pipeline, prove that a healthy canary promotes and a poisoned one rolls back on its own, and flip a flag to release dark code with zero new builds.
See how we teach, before you sign up
You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: beacon-prod
spec:
  replicas: 10
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.internal/beacon/api:PLACEHOLDER_SHA
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: api-env
Reading this file
kind: Deployment
This is the all-or-nothing object. A Deployment has no notion of traffic weight, so any new image goes to everyone at once.
image: registry.internal/beacon/api:PLACEHOLDER_SHA
The pipeline substitutes the git short SHA here. The image tag contract from earlier rungs is unchanged; only the controller around it changes.
replicas: 10
Ten replicas keeps a sane pod count, but the precise traffic split comes from the mesh weight you set later, not from rounding 10 percent onto whole pods.
containerPort: 8000
The stable :8000 API contract is untouched. Progressive delivery changes how pods receive traffic, never the API surface.
The current prod api Deployment. A normal RollingUpdate means a new image owns all traffic within seconds. You will replace this kind with an Argo Rollout that canaries a real mesh-routed traffic weight.
That's 1 of 7 explained code blocks in this single project.
The build, milestone by milestone
- 1
Convert the api Deployment into a mesh-routed stepped Rollout
4 guided steps
A Deployment can only do all-or-nothing rolling updates. A Rollout with trafficRouting adds the one capability this whole rung depends on: routing a precise, defined fraction of real traffic to the new version before everyone, which is the precondition for catching a bad release with only a known slice of users affected. Without trafficRouting, setWeight is only approximated by replica count and the weight status stays empty.
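A minimal sketch of what the converted object might look like, assuming the platform mesh is Istio (the text names only "the platform mesh"). The Service and VirtualService names (api-stable, api-canary, api-vsvc, primary) are hypothetical placeholders, and the pause durations are illustrative. The pod template is carried over unchanged from the Deployment shown earlier.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout                      # replaces kind: Deployment
metadata:
  name: api
  namespace: beacon-prod
spec:
  replicas: 10
  selector:
    matchLabels:
      app: api
  template:                        # same pod template as the Deployment
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.internal/beacon/api:PLACEHOLDER_SHA
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: api-env
  strategy:
    canary:
      stableService: api-stable    # hypothetical Service name
      canaryService: api-canary    # hypothetical Service name
      trafficRouting:
        istio:                     # assumption: the mesh is Istio
          virtualService:
            name: api-vsvc         # hypothetical VirtualService name
            routes:
              - primary            # hypothetical route name
      steps:
        - setWeight: 10            # step 0
        - pause: {duration: 5m}    # step 1
        - setWeight: 50            # step 2
        - pause: {duration: 5m}    # step 3
        - setWeight: 100           # step 4
```

With the trafficRouting block present, each setWeight is applied as a mesh route weight rather than being approximated by how many of the ten pods run the new image.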
- 2
Gate the 10 percent canary on a background Prometheus analysis scoped to the canary pods
5 guided steps
This is the heart of the rung: a poisoned canary must auto-abort while it touches at most about 10 percent of users, and a healthy one must auto-promote with no human. Three things make that true. First, startingStep: 1 starts the background analysis at the pause after setWeight 10 (steps are zero-indexed: step 0 is setWeight 10, step 1 is that pause), so the SLO is being judged during the 10 percent window, not only after promotion to 50 percent. Second, timed pauses (pause: {duration: 5m}) auto-advance on their own, because a background AnalysisRun never advances an indefinite pause: {}; if you left the pauses indefinite, a healthy SHA would stall forever at 10 percent and never promote, and a poisoned SHA would also sit there. Third, the query is scoped to the canary pod-template-hash: if the query mixed canary and stable traffic, a small bad slice would be diluted by the larger stable share and the gate would never trip.
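The three pieces could be sketched as follows. This is an illustration under stated assumptions, not the project's reference solution: the Prometheus address, the http_requests_total metric, the pod_template_hash and code label names, the 0.99 threshold, and the template name api-canary-success-rate are all hypothetical. The label that carries the hash in your metrics depends on how Prometheus relabels pod labels.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: api-canary-success-rate    # hypothetical name
  namespace: beacon-prod
spec:
  args:
    - name: canary-hash            # filled in by the Rollout below
  metrics:
    - name: success-rate
      interval: 1m                 # take one measurement per minute
      failureLimit: 0              # abort on the first failed measurement
      successCondition: result[0] >= 0.99   # assumed SLO threshold
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090   # assumed address
          # Both numerator and denominator are restricted to canary pods,
          # so stable traffic cannot dilute the canary's error rate.
          query: |
            sum(rate(http_requests_total{namespace="beacon-prod",pod_template_hash="{{args.canary-hash}}",code!~"5.."}[2m]))
            /
            sum(rate(http_requests_total{namespace="beacon-prod",pod_template_hash="{{args.canary-hash}}"}[2m]))
---
# Fragment to merge into the api Rollout; only the analysis block is shown.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api
  namespace: beacon-prod
spec:
  strategy:
    canary:
      analysis:
        templates:
          - templateName: api-canary-success-rate
        startingStep: 1            # begin at the pause after setWeight 10
        args:
          - name: canary-hash
            valueFrom:
              podTemplateHashValue: Latest   # hash of the canary ReplicaSet
```

To see why the scoping matters, take hypothetical numbers: the canary serves 10 percent of traffic with a 5 percent error rate and stable serves 90 percent with a 0.1 percent error rate. A blended query would report 0.10 × 5% + 0.90 × 0.1% = 0.59% errors, which passes a 99 percent success threshold, while the canary-only query reports 95 percent success and fails it.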
- 3
Wrap the worker's risky path behind an OpenFeature flag
4 guided steps
Even a perfect canary still couples shipping code to running it. A feature flag lets you deploy risky code disabled, then turn it on for a fraction of traffic and turn it off instantly if it misbehaves, all without the minutes a redeploy or rollback costs. It is the fastest possible kill switch.
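OpenFeature is a vendor-neutral SDK API and does not itself define a flag file format, so the flag definition depends on the provider you pick. As one illustration, assuming flagd as the provider, a flag that enables the risky path for roughly 10 percent of jobs might look like the sketch below. The flag key worker-risky-path and the 10/90 split are hypothetical.

```yaml
# Hypothetical flagd flag definition; the provider choice is an assumption.
flags:
  worker-risky-path:
    state: ENABLED
    variants:
      "on": true       # quoted so YAML does not parse on/off as booleans
      "off": false
    defaultVariant: "off"          # code ships dark unless targeting says otherwise
    targeting:
      fractional:                  # buckets evaluations by targeting key
        - ["on", 10]               # about 10 percent of jobs take the new path
        - ["off", 90]
```

For the split to be stable per job, the worker would need to pass a targeting key (a job ID, for example) in the evaluation context. Removing the targeting block makes every evaluation fall back to the "off" default, which is the kill switch, and raising the "on" weight widens the release, in both cases without a new build.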
- 4
Drive promotion and abort from the pipeline
4 guided steps
Progressive delivery that only works from a laptop is not delivery. Wiring promotion and abort into the pipeline makes the canary verdict part of the release record: a failed analysis turns the pipeline red, so a rolled-back release is visible and auditable, not a silent kubectl command someone ran.
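The text does not name a CI system, so the following is only a sketch assuming GitHub Actions and a runner that already has kubectl, the Argo Rollouts kubectl plugin, and credentials for the cluster. The workflow name, the sha input, and the 20 minute timeout are hypothetical.

```yaml
name: deploy-prod                  # hypothetical workflow name
on:
  workflow_dispatch:
    inputs:
      sha:
        description: Git short SHA of the image to promote
        required: true
jobs:
  canary:
    runs-on: ubuntu-latest         # assumed to have kubectl + rollouts plugin
    env:
      SHA: ${{ inputs.sha }}
    steps:
      - name: Point the Rollout at the new image
        run: >
          kubectl argo rollouts set image api
          "api=registry.internal/beacon/api:${SHA}"
          -n beacon-prod
      - name: Wait for the canary verdict
        # Exits non-zero if the rollout ends up Degraded (for example after
        # a failed analysis aborts it) or if the timeout is reached.
        run: kubectl argo rollouts status api -n beacon-prod --timeout 20m
      - name: Abort if the rollout did not finish healthy
        if: failure()
        run: kubectl argo rollouts abort api -n beacon-prod
```

Because the status step fails the job when the analysis aborts the canary, the rollback shows up as a red run in the pipeline history rather than as an untracked manual command.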