62 Production Scenarios
Select a skill area, then follow the pipeline from Foundations → Expert.
Foundations
Guided walkthroughs · 8–15 min each
Store objects in Google Cloud Storage
Create a bucket, upload objects, set lifecycle rules and uniform access.
You'll learn: Buckets and objects
Your first AWS Lambda function
Create a Lambda function and invoke it over HTTP (API Gateway) or with a test event.
You'll learn: Lambda handler
Tag resources for cost allocation
Apply tags (project, env, owner) so you can filter and report on cost by tag.
You'll learn: Tagging strategy
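A minimal sketch of a tag check, assuming the three keys named above (project, env, owner) are mandatory. The function name and the sample inventory are hypothetical.

```python
REQUIRED_TAGS = ("project", "env", "owner")  # keys taken from the scenario text


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the required tag keys that are absent or empty on a resource."""
    return [key for key in REQUIRED_TAGS if not tags.get(key, "").strip()]


# Hypothetical inventory: resource id -> tags
resources = {
    "vm-1": {"project": "shop", "env": "prod", "owner": "team-a"},
    "vm-2": {"project": "shop", "env": ""},
}

for resource_id, tags in resources.items():
    gaps = missing_tags(tags)
    if gaps:
        print(f"{resource_id}: missing {', '.join(gaps)}")
```

Untagged resources cannot be attributed in a cost report, so a check like this is usually run before tags are relied on for filtering.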
Deploy to Azure App Service with staging slots and zero-downtime swaps
Host a web application on Azure App Service, create a staging deployment slot, deploy the new version there first for smoke testing, then swap it to production with zero downtime using App Service warm-up.
You'll learn: App Service Plans: which tiers support deployment slots (Standard and above)
Deploy a production-ready containerised API to Google Cloud Run
Package a REST API in Docker, push it to Artifact Registry, deploy to Cloud Run with secret injection from Secret Manager, configure traffic splitting for canary releases, and set concurrency limits for cost control.
You'll learn: Cloud Run vs Cloud Functions: when a container beats a function
Professional
Real problem-solving · 15–25 min each
Deploy a 3-tier app on AWS
Web app, app server, and database using EC2, ALB, and RDS.
You'll learn: VPC and subnets
Harden IAM for a web application
Least privilege, roles for EC2/ECS, and policy boundaries.
You'll learn: IAM roles
Serverless API on Azure
HTTP-triggered Function App and API Management.
You'll learn: Azure Functions
Serverless API on Google Cloud
Cloud Functions and API Gateway for HTTP APIs.
You'll learn: Cloud Functions
IAM and service accounts on GCP
Service accounts, roles, and resource hierarchy.
You'll learn: Service accounts
Create a VPC with public and private subnets (AWS)
VPC, subnets, internet gateway, and route tables for a simple two-tier setup.
You'll learn: VPC and CIDR
Ship a containerised app from GitHub to ECS with zero-downtime deployments
Build a full CI/CD pipeline: GitHub Actions builds and scans a Docker image, pushes to ECR, and deploys to ECS Fargate using a rolling update strategy with ALB health-check gating.
You'll learn: GitHub Actions workflow: build, test, scan, push, deploy
Build a resilient event-driven order processing system with SQS and DLQ
Design and implement an order processing pipeline using SQS FIFO queues for deduplication, Lambda consumers with idempotent handlers, SNS fan-out for downstream notifications, and a Dead Letter Queue for failed messages.
You'll learn: SQS FIFO vs Standard: ordering guarantees and deduplication IDs
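The idea of an idempotent consumer can be sketched as follows. This is an illustration only: the `dedup_id` field name is hypothetical, and the in-memory set stands in for a durable store that a real consumer would need.

```python
processed_ids: set[str] = set()  # stand-in for a durable store (assumption)


def handle_order(message: dict) -> str:
    """Process an order message at most once, keyed on its deduplication ID."""
    dedup_id = message["dedup_id"]  # hypothetical field name
    if dedup_id in processed_ids:
        return "skipped"  # a redelivered message has no further effect
    # ... business logic would run here ...
    processed_ids.add(dedup_id)
    return "processed"
```

Recording the ID only after the work succeeds means a failed attempt can be retried, and eventually moved to the Dead Letter Queue, without being mistaken for a duplicate.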
Cut your cloud bill by 40% with right-sizing and Reserved Instance analysis
Run a structured cost optimization audit: identify over-provisioned instances with CloudWatch metrics, model Reserved Instance savings, set up budget alerts, and implement an automatic stop/start schedule for non-production resources.
You'll learn: AWS Cost Explorer and Compute Optimizer: interpreting right-sizing recommendations
Harden a containerised application: image scanning, secrets, and runtime policies
Apply defence-in-depth to a Docker-based application: drop root privileges, use distroless base images, scan for CVEs in CI, manage secrets with Vault/Secrets Manager, and enforce a runtime security policy with Falco.
You'll learn: Distroless and minimal base images: what attack surface looks like in a 3MB image
Scale an application automatically under real load with EC2 Auto Scaling
Configure an Auto Scaling Group with target tracking (CPU 50%) and scheduled scaling for predictable peaks, then validate it by running a k6 load test that triggers both scale-out and scale-in events.
You'll learn: Target tracking vs step scaling: when to use each policy
Provision a production AWS environment with Terraform and remote state
Write reusable Terraform modules to provision a VPC, security groups, and EC2 instances — using an S3 + DynamoDB remote backend for team-safe state locking and Terraform workspaces to manage dev and prod from one codebase.
You'll learn: Remote state: why local state is dangerous in teams and how S3 + DynamoDB solves it
Protect a web application from attacks with CloudFront and AWS WAF
Place AWS WAF in front of your application via CloudFront: enable Managed Rule Groups (SQLi, XSS, bad bots), add a custom rate-limit rule to stop brute-force attacks, geo-restrict access, and query the WAF logs with Athena.
You'll learn: CloudFront as a security boundary: blocking attacks at the edge before they reach origin
Scale read-heavy workloads with RDS PostgreSQL read replicas
Add read replicas to an RDS PostgreSQL instance to horizontally scale SELECT queries, route reads to replicas and writes to the primary via your application connection pool, and monitor replication lag to detect when a replica falls behind.
You'll learn: RDS read replicas: asynchronous streaming replication and eventual consistency
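A rough sketch of read/write routing with a lag budget, assuming the application can see each replica's current lag in seconds. The class name, endpoint strings, and the 5-second default are hypothetical.

```python
from itertools import cycle


class ReadWriteRouter:
    """Route SELECTs to a sufficiently fresh replica, everything else to the primary."""

    def __init__(self, primary: str, replicas: list[str], max_lag_seconds: float = 5.0):
        self.primary = primary
        self.replicas = replicas
        self.max_lag_seconds = max_lag_seconds
        self._round_robin = cycle(replicas)

    def endpoint_for(self, sql: str, lag_seconds: dict[str, float]) -> str:
        if not sql.lstrip().upper().startswith("SELECT"):
            return self.primary  # writes always go to the primary
        # Try each replica once, in round-robin order.
        for _ in range(len(self.replicas)):
            candidate = next(self._round_robin)
            if lag_seconds.get(candidate, float("inf")) <= self.max_lag_seconds:
                return candidate
        return self.primary  # every replica is too far behind
```

Because replication is asynchronous, reads that must see a just-committed write (and any SELECT inside a write transaction) are better sent to the primary; this sketch does not detect those cases.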
Auto-remediate cloud threats with GuardDuty and Lambda in 60 seconds
Enable Amazon GuardDuty, wire EventBridge to route HIGH-severity findings to a Lambda that automatically isolates the compromised EC2 instance — replacing its security groups with a forensic deny-all group and snapshotting its disk for analysis.
You'll learn: GuardDuty finding types: credential theft, C2 communication, recon, data exfiltration
Stand up a production-grade EKS cluster with IRSA and the AWS Load Balancer Controller
Create an EKS cluster with managed node groups in private subnets, configure IAM Roles for Service Accounts (IRSA) so pods get least-privilege AWS access, install the AWS Load Balancer Controller, and auto-provision an ALB from a Kubernetes Ingress resource.
You'll learn: Managed node groups vs self-managed: why managed is the right default
Build event-driven workflows with Azure Service Bus topics and Azure Functions
Create a Service Bus topic with multiple subscriptions for fan-out messaging, build Azure Functions triggered per subscription, enforce ordered processing with sessions, and handle poison messages via the Dead Letter Queue.
You'll learn: Service Bus topics vs queues: when fan-out requires a topic with subscriptions
Set up a highly available Cloud SQL database with private IP and failover
Create a Cloud SQL PostgreSQL instance with High Availability (synchronous standby in a different zone), connect exclusively over a private IP via the Cloud SQL Auth Proxy, and validate automatic failover by triggering a simulated zone failure.
You'll learn: HA configuration: primary instance with synchronous standby in a different zone
Orchestrate a distributed transaction with AWS Step Functions and the Saga pattern
Implement the Saga pattern for a distributed order: Step Functions orchestrates five Lambda steps (including reserve inventory, charge payment, and schedule delivery) and automatically runs compensating transactions to roll back completed steps when any step fails.
You'll learn: Saga pattern: why distributed transactions need compensating actions instead of 2PC
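The compensation logic can be sketched in plain Python, independent of Step Functions. This is a simplified in-process illustration, not the state machine the scenario builds; `run_saga` and the step tuple layout are hypothetical.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: list[str] = []
    completed: list[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name}: failed")
            # Undo only what actually completed, newest first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"{done_name}: compensated")
            return log
        completed.append(step)
        log.append(f"{name}: ok")
    return log
```

If the payment step raises after inventory was reserved, only the inventory reservation is compensated. There is no global lock or two-phase commit; each step commits locally and is undone by a separate action.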
Protect a GCP application from DDoS and web attacks with Cloud Armor
Deploy Cloud Armor in front of a Global HTTP(S) Load Balancer: enable OWASP Top 10 preconfigured rules, create a custom rate-limit rule, enable Adaptive Protection for ML-based DDoS detection, and query blocked requests in BigQuery.
You'll learn: Cloud Armor vs firewall rules: WAF at the global load balancer vs VPC firewall
Accelerate and protect a global web app with Azure Front Door and WAF
Set up Azure Front Door as your global entry point: route users to the nearest healthy backend, cache static assets at the edge, enforce WAF policies to block OWASP threats, and configure health probes for automatic failover between regions.
You'll learn: Azure Front Door vs Azure CDN: when global routing intelligence justifies Front Door
Eliminate database bottlenecks with Amazon ElastiCache Redis caching
Deploy ElastiCache Redis in cluster mode, implement the cache-aside pattern in your application, add TTL jitter to prevent cache stampede, tune the eviction policy for cost control, and verify cache hit rate exceeds 90% under load.
You'll learn: Cache-aside vs write-through vs write-behind: choosing the right pattern
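A minimal sketch of cache-aside with TTL jitter. A dictionary stands in for Redis here, and the function names, the 300-second base TTL, and the 10% jitter are assumptions for illustration.

```python
import random
import time

# key -> (expires_at, value); stand-in for Redis in this sketch
_cache: dict[str, tuple[float, str]] = {}


def jittered_ttl(base_seconds: float, jitter_fraction: float = 0.1) -> float:
    """Spread expiry times so keys cached together do not all expire together."""
    return base_seconds * (1 + random.uniform(-jitter_fraction, jitter_fraction))


def get_with_cache_aside(key: str, load_from_db, base_ttl: float = 300.0) -> str:
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]  # cache hit
    value = load_from_db(key)  # cache miss: read the source of truth
    _cache[key] = (now + jittered_ttl(base_ttl), value)
    return value
```

In cache-aside the application owns both the lookup and the fill, which is why the loader is passed in rather than hidden inside the cache.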
Build end-to-end Azure observability with Monitor, Log Analytics, and Workbooks
Instrument an Azure application end-to-end: Application Insights for request telemetry, Log Analytics for centralised logs, metric and log query alerts with dynamic thresholds, Action Groups for Slack and PagerDuty routing, and a Workbook dashboard for on-call.
You'll learn: Azure Monitor data sources: VM metrics, App Insights telemetry, Activity Log
Build a real-time collaborative app backend with Google Cloud Firestore
Use Firestore as a real-time database: design a document schema with subcollections for per-user isolation, write Security Rules that enforce access control, implement real-time listeners, and handle offline persistence for mobile clients.
You'll learn: Firestore data model: documents, collections, and subcollections vs flat structure
Design a DynamoDB single-table model for a multi-entity SaaS application
Apply single-table design to store Users, Organisations, Memberships, and Subscriptions in one DynamoDB table — define access patterns first, use composite keys and GSIs for all query shapes, and enable TTL for automatic session expiry.
You'll learn: Access-pattern-first design: list every query before designing the schema
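As an illustration of composite keys, the sketch below models two of the entities with generic `PK`/`SK` attributes. The `ENTITY#id` key format is a common convention assumed here, not something the scenario specifies, and `query` only imitates a key-condition query over an in-memory list.

```python
def org_key(org_id: str) -> dict[str, str]:
    return {"PK": f"ORG#{org_id}", "SK": "METADATA"}


def membership_key(org_id: str, user_id: str) -> dict[str, str]:
    return {"PK": f"ORG#{org_id}", "SK": f"MEMBER#{user_id}"}


def query(items: list[dict], pk: str, sk_prefix: str = "") -> list[dict]:
    """Imitate 'PK = :pk AND begins_with(SK, :prefix)' over an in-memory list."""
    return [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]


table = [
    {**org_key("acme"), "name": "Acme Ltd"},
    {**membership_key("acme", "u1"), "role": "admin"},
    {**membership_key("acme", "u2"), "role": "viewer"},
    {**membership_key("globex", "u1"), "role": "viewer"},
]

# Access pattern: "list all members of an organisation"
members = query(table, "ORG#acme", "MEMBER#")
```

Storing memberships under the organisation's partition key is what lets one query return them all. The reverse lookup (all organisations for a user) would need a GSI with the keys swapped.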
Make serverless functions production-ready with AWS Lambda Powertools
Use Lambda Powertools to add structured JSON logging, custom CloudWatch metrics via EMF, distributed X-Ray tracing, and a DynamoDB-backed idempotency layer — transforming a black-box Lambda into a fully observable, production-grade function.
You'll learn: Structured logging: why JSON logs are far easier to query than plain text in CloudWatch
Build a complete Azure DevOps CI/CD pipeline from code commit to AKS
Wire a multi-stage Azure Pipelines YAML that builds a Docker image, runs tests and SAST scanning, pushes to Azure Container Registry, and deploys to AKS via Helm — with human approval gates between staging and production.
You'll learn: Azure Pipelines YAML: multi-stage pipeline with build, test, and deploy stages
Set Up Terraform Remote State with S3 and DynamoDB Locking
Store Terraform state remotely in S3 with DynamoDB-backed state locking to prevent concurrent apply conflicts in a team environment.
You'll learn: S3 versioning and KMS encryption for state storage
Optimize GitHub Actions: Caching, Parallelism, and Affected-Only Builds
Cut CI pipeline time from 15 minutes to under 5 minutes using actions/cache for node_modules, parallel job splitting, and path filters to skip unnecessary runs.
You'll learn: actions/cache with lock-file hash keys for node_modules
Run Fault-Tolerant Workloads on EC2 Spot Instances with Auto-Restart
Use Spot Instances for batch processing workloads with EventBridge-triggered Lambda auto-restart, cutting compute costs by up to 90% compared to On-Demand pricing.
You'll learn: EC2 Spot Fleet with capacity-optimized allocation
Deploy Prometheus with AlertManager and PagerDuty Integration
Install kube-prometheus-stack via Helm on Kubernetes, write PromQL alerting rules for error rate and latency, and route critical alerts to PagerDuty with inhibition to prevent alert storms.
You'll learn: kube-prometheus-stack Helm installation
Identify and Eliminate AWS Cost Waste: Right-sizing and Idle Resources
Use AWS Cost Explorer, Compute Optimizer, and CloudWatch metrics to identify EC2 over-provisioning, idle resources, and S3 storage waste — then automate the fixes.
You'll learn: Cost Explorer for identifying top cost drivers
Right-size Kubernetes Pods with Resource Requests/Limits and VPA
Identify OOMKilled and CPU-throttled pods, install the Vertical Pod Autoscaler in recommendation-only mode (updateMode "Off"), apply its suggestions, and configure namespace LimitRanges to enforce defaults.
You'll learn: Diagnosing OOMKilled and CPU throttle events from kubectl
Expert
Minimal guidance · 20–30 min each
Secure a 3-tier architecture
Network segmentation, encryption, and identity at each tier.
You'll learn: NSGs / security groups
Design a 3-tier app across 5 AWS accounts
Architect a production-ready 3-tier web application using AWS multi-account best practices: separate accounts for security, logging, shared services, non-prod, and production workloads.
You'll learn: Multi-account strategy and AWS Organizations
Design a 3-tier app across 5 Azure subscriptions
Architect a production-ready 3-tier web application using Azure multi-subscription best practices: separate subscriptions for management, logging, connectivity, non-prod, and production workloads.
You'll learn: Multi-subscription strategy and Azure Management Groups
Build a multi-region disaster recovery setup on AWS
Implement a warm-standby DR strategy: Aurora Global Database cross-region replication, Route 53 health-check failover, and S3 cross-region replication — with a documented RTO/RPO.
You'll learn: RTO vs RPO trade-offs for warm standby, pilot light, and active-active
Harden a GKE cluster: RBAC, Network Policies, and Workload Identity
Lock down a Google Kubernetes Engine cluster using least-privilege RBAC, deny-all NetworkPolicies, Workload Identity for pod-level GCP permissions, and Binary Authorization to block unsigned images.
You'll learn: Kubernetes RBAC: ClusterRole, Role, RoleBinding scoped by namespace
Define SLOs and wire up golden-signal alerts for a production service
Apply the SRE model end-to-end: define SLIs/SLOs for a user-facing API, create error-budget burn-rate alerts, and build a dashboard showing the four golden signals (latency, traffic, errors, saturation).
You'll learn: SLI vs SLO vs SLA — what each one means in practice
Run your first chaos engineering experiment with AWS Fault Injection Simulator
Design a hypothesis, inject CPU stress and network latency into a production-like environment using AWS FIS, observe how your application behaves, and use the results to improve resilience before a real failure happens.
You'll learn: Chaos engineering principles: hypothesis-driven experiments, blast radius control
Migrate a PostgreSQL database to Aurora with zero downtime using DMS
Perform a live database migration from a self-managed PostgreSQL instance to Amazon Aurora PostgreSQL using AWS DMS — with full-load, ongoing replication, and a carefully staged cutover that keeps your app live throughout.
You'll learn: AWS DMS: replication instance, endpoints, and task configuration
Deploy a service mesh on AKS with Istio mTLS and traffic shaping
Install Istio on Azure Kubernetes Service, enforce mutual TLS between all services, implement a canary deployment using VirtualService traffic weights, and observe service-to-service traffic through Kiali.
You'll learn: Istio control plane: Istiod and sidecar injection per namespace
Build a real-time data pipeline: Pub/Sub to Dataflow to BigQuery
Stream events from a Pub/Sub topic through a Dataflow (Apache Beam) pipeline that filters, enriches, and windows the data, landing it in BigQuery for real-time analytics — the same pattern used by Google, Spotify, and Twitter.
You'll learn: Pub/Sub push vs pull subscriptions and message acknowledgement
Design an active-active multi-region architecture on AWS
Go beyond DR: build an architecture where both us-east-1 and eu-west-1 serve live user traffic simultaneously using Route 53 latency routing, DynamoDB Global Tables, and a conflict-resolution strategy for writes.
You'll learn: Active-active vs active-passive: the CAP theorem trade-off in practice
Build a globally distributed database with Azure Cosmos DB multi-region writes
Configure Cosmos DB for multi-region writes, design a partition key for even data distribution, test all five consistency levels, implement conflict resolution for simultaneous writes in two regions, and stream changes via Change Feed.
You'll learn: Consistency levels: Strong, Bounded Staleness, Session, Consistent Prefix, Eventual
Event-driven pod autoscaling based on SQS queue depth with KEDA
Install KEDA on Kubernetes and define a ScaledObject that adds worker pods when an SQS queue grows and removes them (to zero) when it drains — achieving event-driven autoscaling that CPU-based HPA alone cannot provide.
You'll learn: KEDA architecture: ScaledObject, external metrics API, and the scaler plugins
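The scaling arithmetic can be approximated as below. This assumes the queue-depth target is treated as an average value per pod, so the desired count is the queue length divided by the target, rounded up and clamped; the function name and default limits are hypothetical.

```python
import math


def desired_replicas(
    queue_length: int,
    target_per_pod: int,
    min_replicas: int = 0,
    max_replicas: int = 10,
) -> int:
    """Approximate replica count for a queue-depth target, clamped to the limits."""
    if target_per_pod <= 0:
        raise ValueError("target_per_pod must be positive")
    wanted = math.ceil(queue_length / target_per_pod)
    return max(min_replicas, min(max_replicas, wanted))
```

With a target of 5 messages per pod, 12 queued messages ask for 3 pods and an empty queue asks for 0. A CPU-based HPA cannot scale to zero in the same way, because idle workers report no load to scale on.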
Respond to a simulated cloud security incident using AWS native tools
Work through a realistic incident response exercise: a GuardDuty finding shows an EC2 instance communicating with a C2 server. Contain it, capture forensic evidence, trace the attack in CloudTrail, and produce a post-incident report.
You'll learn: Incident response phases: Detect, Contain, Eradicate, Recover, Post-Incident
Prevent data exfiltration with GCP VPC Service Controls security perimeter
Define a VPC Service Controls perimeter around BigQuery and Cloud Storage: even users with valid IAM permissions cannot access these services from outside the perimeter, and data cannot be copied to external projects — closing the insider-threat exfiltration path.
You'll learn: VPC SC vs IAM: why a second perimeter layer is needed even with perfect IAM
Design SLOs and Configure Burn Rate Alerts in CloudWatch
Define a 99.9% availability SLO for an API, calculate error budgets, and configure fast-burn and slow-burn CloudWatch alarms to detect budget exhaustion before users notice.
You'll learn: SLI and SLO design from ALB metrics
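The error-budget arithmetic behind burn-rate alerts can be sketched as follows, assuming a 30-day rolling window; the function names are hypothetical.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of full unavailability the SLO tolerates over the window."""
    return (1 - slo) * window_days * 24 * 60


def burn_rate(error_ratio: float, slo: float) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    return error_ratio / (1 - slo)


def budget_consumed(rate: float, alert_window_hours: float, window_days: int = 30) -> float:
    """Fraction of the whole budget spent if `rate` holds for the alert window."""
    return rate * alert_window_hours / (window_days * 24)
```

For a 99.9% SLO the budget is about 43.2 minutes per 30 days. A burn rate of 14.4 sustained for one hour spends about 2% of it, which is the kind of threshold a fast-burn alarm targets. A slow-burn alarm uses a lower rate over a longer window.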
Autoscale Kubernetes Pods Based on SQS Queue Depth with KEDA
Install KEDA on EKS, configure IAM roles for service accounts (IRSA), and create a ScaledObject that scales worker pods from zero to N based on SQS queue depth.
You'll learn: KEDA installation with Helm on EKS
Configure Route 53 Health Checks and Automated Failover
Set up Route 53 health checks on primary and secondary ALBs with automatic DNS failover, achieving sub-90-second recovery when the primary region becomes unhealthy.
You'll learn: Route 53 health checks for ALB endpoints
Sign Container Images with Cosign and Enforce Signatures in Kubernetes
Use Sigstore Cosign for keyless container image signing in GitHub Actions CI, then enforce signature verification with a Kyverno ClusterPolicy that blocks unsigned images.
You'll learn: Cosign keyless signing with GitHub OIDC in CI
Build Warm Standby DR with RDS Cross-Region Read Replica
Create an RDS cross-region read replica in us-west-2, monitor replica lag with CloudWatch alarms, and practice the full promotion procedure to validate your RTO.
You'll learn: RDS cross-region read replica creation
Execute Zero-Downtime Database Schema Migration with the Expand-Contract Pattern
Use the expand-contract (parallel change) pattern to safely rename a PostgreSQL column across a distributed system without downtime, locks, or data loss.
You'll learn: Expand phase: adding new column without table lock
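The phases can be walked through with an in-memory SQLite database standing in for PostgreSQL. The table and column names (`users`, `fullname`, `display_name`) are hypothetical, and locking behaviour differs between the two engines, so this sketch shows only the order of operations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# Expand: add the new column alongside the old one.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


# Dual-write: code deployed during the transition writes both columns.
def insert_user(name: str) -> None:
    conn.execute(
        "INSERT INTO users (fullname, display_name) VALUES (?, ?)", (name, name)
    )


insert_user("Grace Hopper")

# Backfill: copy old values into rows written before the expand step.
conn.execute("UPDATE users SET display_name = fullname WHERE display_name IS NULL")

# Readers switch to the new column once every row is populated.
names = [row[0] for row in conn.execute("SELECT display_name FROM users ORDER BY id")]

# Contract (not run here): once no deployed code touches fullname, drop it.
```

On a large production table the backfill would normally run in small batches, so that no single statement holds locks for long.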