Advanced · Beacon · Project 8 of 12 · ~12h · 4 milestones

Ship with progressive delivery: canary and feature flags

Continues from the last build: Promotion to prod is gated and a clean artifact lands every time, but each prod deploy is still all-or-nothing: the new image flips to 100 percent of traffic the instant it rolls, so a bad release reaches every user before anyone reads the first error.

Last rung you earned a gated path to prod: one immutable image, promoted by SHA, approved by a human.

Progressive delivery with canary rollouts
Argo Rollouts as a delivery controller
Weighted traffic routing over a service mesh
Background metric analysis scoped to canary pods and automated rollback
Feature flags with OpenFeature
Decoupling deploy from release
Driving rollout promotion and abort from CI
SLO-based release gating with sound PromQL

What you'll build

You will turn Beacon's all-or-nothing prod deploy into a progressive delivery system. The api service ships as an Argo Rollout that canaries real traffic at 10, 50, and 100 percent over the platform mesh. A Prometheus AnalysisTemplate, scoped to the canary pods by pod-template-hash, runs as a background analysis that begins at the 10 percent step, so a poisoned canary auto-aborts while it touches at most about 10 percent of users, and a healthy one auto-promotes through every step with no human involved. The worker's risky code path hides behind an OpenFeature flag so you can release it to a fraction of jobs, or kill it, without a redeploy. You will drive promotion and abort from your pipeline, prove that a healthy canary promotes and a poisoned one rolls back on its own, and flip a flag to release dark code with zero new builds.

See how we teach, before you sign up

You don't just get code dumped on you. Every starter file and every solution is explained line-by-line, in plain English. Here's one real file from this project:

deploy/prod/api-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: beacon-prod
spec:
  replicas: 10
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.internal/beacon/api:PLACEHOLDER_SHA
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: api-env

Reading this file

  • kind: Deployment
    This is the all-or-nothing object. A Deployment has no notion of traffic weight, so any new image goes to everyone at once.
  • image: registry.internal/beacon/api:PLACEHOLDER_SHA
    The pipeline substitutes the git short SHA here. The image tag contract from earlier rungs is unchanged; only the controller around it changes.
  • replicas: 10
    Ten replicas keep a sane pod count, but the precise traffic split comes from the mesh weight you set later, not from rounding 10 percent onto whole pods.
  • containerPort: 8000
    The stable :8000 API contract is untouched. Progressive delivery changes how pods receive traffic, never the API surface.

The current prod api Deployment. A normal RollingUpdate means a new image owns all traffic within seconds. You will replace this kind with an Argo Rollout that canaries a real mesh-routed traffic weight.
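
As a rough sketch of where this file is headed, the Deployment above could become a Rollout along the following lines. The pod template is copied from the Deployment unchanged. Everything under strategy is an assumption made for illustration: the source only says "the platform mesh", so Istio is assumed here, and the Service names (api-stable, api-canary), the VirtualService name (api), and the route name (primary) are hypothetical. The 5m pause length is likewise only an example value.

```yaml
# Hypothetical sketch, not the project's reference solution.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api
  namespace: beacon-prod
spec:
  replicas: 10
  selector:
    matchLabels:
      app: api
  template:                         # same pod template as the Deployment
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.internal/beacon/api:PLACEHOLDER_SHA
          ports:
            - containerPort: 8000
          envFrom:
            - secretRef:
                name: api-env
  strategy:
    canary:
      stableService: api-stable     # assumed Service name
      canaryService: api-canary     # assumed Service name
      trafficRouting:
        istio:                      # assumption: the platform mesh is Istio
          virtualService:
            name: api               # assumed VirtualService name
            routes:
              - primary             # assumed HTTP route name
      steps:
        - setWeight: 10             # step 0: mesh sends 10 percent to the canary
        - pause: {duration: 5m}     # step 1: timed pause, advances on its own
        - setWeight: 50             # step 2
        - pause: {duration: 5m}     # step 3
        - setWeight: 100            # step 4: full traffic before promotion
```

With a trafficRouting block present, the weights are enforced by the mesh route rather than inferred from how many of the ten replicas happen to be canary pods.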

That's 1 of 7 explained code blocks in this single project.

The build, milestone by milestone

  1. Convert the api Deployment into a mesh-routed stepped Rollout

    4 guided steps

    A Deployment can only do all-or-nothing rolling updates. A Rollout with trafficRouting adds the one capability this whole rung depends on: routing a precise, defined fraction of real traffic to the new version before everyone else receives it. That is the precondition for catching a bad release while only a known slice of users is affected. Without trafficRouting, setWeight is only approximated by replica count and the weight status stays empty.

  2. Gate the 10 percent canary on a background Prometheus analysis scoped to the canary pods

    5 guided steps

    This is the heart of the rung: a poisoned canary must auto-abort while it touches at most about 10 percent of users, and a healthy one must auto-promote with no human. Three things make that true. First, startingStep: 1 starts the background analysis at the pause after setWeight 10 (steps are zero-indexed: step 0 is setWeight 10, step 1 is that pause), so the SLO is being judged during the 10 percent window, not only after promotion to 50 percent. Second, timed pauses (pause: {duration: 5m}) auto-advance on their own. This matters because a background AnalysisRun never advances an indefinite pause: {}; if you left the pauses indefinite, a healthy SHA would stall forever at 10 percent and never promote, and a poisoned SHA would also sit there. Third, the query is scoped to the canary pod-template-hash: if it mixed canary and stable traffic, a small bad slice would be diluted by the larger stable share and the gate would never trip.

  3. Wrap the worker's risky path behind an OpenFeature flag

    4 guided steps

    Even a perfect canary still couples shipping code to running it. A feature flag lets you deploy risky code disabled, then turn it on for a fraction of traffic and turn it off instantly if it misbehaves, all without the minutes a redeploy or rollback costs. It is the fastest possible kill switch.

  4. Drive promotion and abort from the pipeline

    4 guided steps

    Progressive delivery that only works from a laptop is not delivery. Wiring promotion and abort into the pipeline makes the canary verdict part of the release record: a failed analysis turns the pipeline red, so a rolled-back release is visible and auditable, not a silent kubectl command someone ran.
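
For the last milestone, a pipeline job that blocks on the rollout verdict might look roughly like the sketch below. The source does not name the CI system, so GitLab CI is assumed purely for illustration, along with the job name, the stage name, and the 20m timeout. The commands are from the kubectl argo rollouts plugin, and the runner is assumed to have it installed with credentials for the beacon-prod namespace.

```yaml
# Hypothetical sketch: GitLab CI syntax is an assumption, not the project's pipeline.
deploy-prod:
  stage: deploy                     # assumed stage name
  script:
    # Point the Rollout at the image tagged with this commit's short SHA.
    - kubectl argo rollouts set image api api=registry.internal/beacon/api:${CI_COMMIT_SHORT_SHA} -n beacon-prod
    # Block until the rollout is healthy, degraded, or the timeout expires.
    - |
      if ! kubectl argo rollouts status api -n beacon-prod --timeout 20m; then
        # Make the abort explicit, then fail the job so the pipeline turns red.
        kubectl argo rollouts abort api -n beacon-prod
        exit 1
      fi
```

Keeping the abort inside the job means a timeout is handled the same way as a failed analysis: traffic goes back to the stable version and the failure is recorded against the commit that caused it.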

What's inside when you start

3 starter files, ready to clone
4 guided milestones
4 full reference solutions
7 code blocks explained line-by-line
4 "is it working?" checks
5 interview questions it prepares you for

You'll walk away with

An api Rollout manifest replacing the Deployment, with a trafficRouting block over the platform mesh and canary steps that route a real 10, 50, and 100 percent of traffic via timed pauses
A Prometheus-backed AnalysisTemplate scoped to the canary pods by rollouts_pod_template_hash, gating each step on a 99 percent success-rate SLO, with an empty-vector guard, wired into the Rollout as a background analysis with startingStep 1 so it spans the 10 percent window and auto-aborts on breach
The worker's risky renderer path behind an OpenFeature use-new-renderer flag, evaluated per job, shipping dark by default
A prod pipeline job that applies the SHA, blocks on the rollout verdict, and explicitly aborts plus fails red on a degraded canary
A short demo record showing one healthy SHA auto-promoting and one poisoned SHA auto-rolling back at 10 percent, plus a flag flip releasing dark code with no rebuild
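
To make the analysis deliverable concrete, here is a minimal sketch of what such an AnalysisTemplate could look like. The template name, the Prometheus address, the http_requests_total metric, its status label, and the 1m and 2m windows are all assumptions, not details from the project. The empty-vector guard shown is one possible form: it treats a query that returns no samples as a failed measurement instead of letting result[0] raise an error.

```yaml
# Hypothetical sketch, not the project's reference solution.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: api-success-rate            # assumed template name
  namespace: beacon-prod
spec:
  args:
    - name: canary-hash             # filled in by the Rollout (see note below)
  metrics:
    - name: success-rate
      interval: 1m                  # assumed measurement interval
      # Guard: an empty result vector fails the check instead of erroring.
      successCondition: len(result) > 0 && result[0] >= 0.99
      provider:
        prometheus:
          address: http://prometheus.monitoring.svc:9090   # assumed address
          # Both sums are scoped to the canary pods only, so stable traffic
          # cannot dilute a bad canary. Metric and label names are assumed.
          query: |
            sum(rate(http_requests_total{namespace="beacon-prod",rollouts_pod_template_hash="{{args.canary-hash}}",status!~"5.."}[2m]))
            /
            sum(rate(http_requests_total{namespace="beacon-prod",rollouts_pod_template_hash="{{args.canary-hash}}"}[2m]))
# Wiring note: in the Rollout, under spec.strategy.canary, a background
# analysis would reference this template roughly as:
#   analysis:
#     templates:
#       - templateName: api-success-rate
#     startingStep: 1
#     args:
#       - name: canary-hash
#         valueFrom:
#           podTemplateHashValue: Latest
```

Whether "no data" should fail the canary or count as inconclusive is a design choice. Failing closed, as here, is the stricter option and assumes the canary always receives enough traffic to produce samples during the 10 percent window.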
