Secure Configuration
Hardening infrastructure and application config — using IaC, policy as code, and continuous compliance — so default and drifted settings never create exploitable risk.
What you'll learn
- ☁️ Misconfiguration is the #1 cloud breach cause — not sophisticated attacks. A public S3 bucket or over-permissive IAM role is all it takes
- 📋 CIS Benchmarks are your starting hardening checklist — automated tools (Prowler, Scout Suite) scan cloud accounts against them in minutes
- 🔒 IaC + policy as code = the only scalable approach: config in version control, policies enforced before apply, continuous checks for drift
- 🚫 Never use the console for production changes: every console edit bypasses code review, leaves no trace in version control, and is rarely reverted
- ⚡ Detect drift within hours, not months — continuous compliance tools (AWS Config, Cloud Custodian) run constantly and alert on violations
- 🎯 Least privilege is a configuration property, not a code property — IAM roles, security groups, and RBAC are configuration decisions made at deploy time
Why configuration is the biggest attack surface you're probably ignoring
Security Misconfiguration is a fixture of the OWASP Top 10: it ranked fifth (A05) in the 2021 edition and appeared in the editions before it. Verizon's Data Breach Investigations Report has repeatedly listed misconfiguration among the most common error-driven causes of breaches. And yet most development teams invest heavily in code security (SAST, code review, secure coding) while their infrastructure configuration is managed by whoever has console access and a deadline.
Misconfiguration facts
Gartner: through 2025, 99% of cloud security failures will be the customer's fault, and most of those trace back to misconfiguration rather than platform vulnerabilities. Capital One: $80M fine, 100M+ records. Twitch: 125GB leak. Facebook: 540M user records left in a publicly accessible S3 bucket run by a third-party app developer. None of these required a sophisticated zero-day exploit. They required finding a misconfigured service.
Configuration security is not just about one setting. It spans every layer of your stack: cloud IAM and network rules, container security context and network policies, application environment variables, and infrastructure-as-code definitions. Each layer has its own failure modes. The cloud/IAM layer illustrates the pattern.
☁️ Cloud / IAM Configuration
Common misconfiguration: public S3 bucket (severity: critical)
⚠ Real-world example: Capital One breach (2019). A misconfigured WAF let an attacker use SSRF to obtain the credentials of an over-permissive IAM role, which could read S3 buckets containing records of 100M customers.
✅ How to fix it: Enable S3 Block Public Access at account level. Enforce it in Terraform with aws_s3_account_public_access_block (account-wide) or aws_s3_bucket_public_access_block (per bucket), backed by an OPA policy.
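As a rough illustration of what a "no public S3" policy check evaluates, the sketch below inspects a dictionary shaped like an S3 PublicAccessBlockConfiguration and reports any of the four flags that are not enabled. The function name and the sample input are hypothetical; a real check would read the configuration from the AWS API or from a Terraform plan.

```python
from __future__ import annotations

# The four settings that together make up S3 Block Public Access.
REQUIRED_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_flags(config: dict) -> list[str]:
    """Return the Block Public Access flags that are absent or not True.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    return [flag for flag in REQUIRED_FLAGS if config.get(flag) is not True]


if __name__ == "__main__":
    # Sample input (assumed): one flag has been switched off.
    account_config = {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": True,
    }
    print(missing_public_access_flags(account_config))
```

Treating a missing flag the same as a disabled one keeps the check fail-closed: a configuration that was never set is reported rather than silently passed.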
The three pillars: prevent, detect, remediate
Secure configuration at scale requires three mutually reinforcing controls:
The three-pillar model
- Prevent (IaC + policy as code): Define all configuration in version-controlled IaC (Terraform, CloudFormation). Every change goes through PR review. Policy scanning (Checkov, tfsec, OPA/Conftest) enforces rules before terraform apply, blocking non-compliant config before it exists
- Detect (continuous compliance): AWS Config, Cloud Custodian, or Wiz scan live cloud resources continuously against policy rules. Detects drift from IaC (console changes), new resources that bypassed the pipeline, and policy violations that evolved over time
- Remediate (alert + auto-fix): When drift is detected, high-severity findings auto-remediate (e.g., auto-close public S3 bucket). Lower severity creates tickets with SLA. The rule: detection without remediation is just documentation of failure
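To make the "prevent" pillar concrete, here is a minimal sketch of the kind of rule a policy scanner applies before terraform apply. It assumes the plan has been exported with `terraform show -json` and walks its `resource_changes` list, flagging any aws_security_group whose inline ingress rules open SSH to 0.0.0.0/0. The function names are hypothetical, and rules declared through separate rule resources are not covered; tools like Checkov or Conftest implement far more complete versions of this idea.

```python
from __future__ import annotations


def _covers_port(rule: dict, port: int) -> bool:
    """True if an ingress rule's port range includes the given port."""
    if str(rule.get("protocol")) == "-1":  # "-1" means all protocols and ports
        return True
    low, high = rule.get("from_port"), rule.get("to_port")
    return low is not None and high is not None and low <= port <= high


def open_ssh_violations(plan: dict) -> list[str]:
    """Return addresses of security groups that expose port 22 to the world."""
    violations = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_security_group":
            continue
        # "after" is null for resources that the plan deletes.
        after = (change.get("change") or {}).get("after") or {}
        for rule in after.get("ingress") or []:
            world_open = "0.0.0.0/0" in (rule.get("cidr_blocks") or [])
            if world_open and _covers_port(rule, 22):
                violations.append(change["address"])
                break
    return violations


if __name__ == "__main__":
    # Sample plan fragment (assumed data, trimmed to the fields the check reads).
    sample_plan = {
        "resource_changes": [
            {
                "address": "aws_security_group.bastion",
                "type": "aws_security_group",
                "change": {"after": {"ingress": [
                    {"cidr_blocks": ["0.0.0.0/0"], "from_port": 22,
                     "to_port": 22, "protocol": "tcp"},
                ]}},
            },
        ]
    }
    print(open_ssh_violations(sample_plan))
```

In a pipeline, a non-empty result would fail the job, so the non-compliant security group never reaches the apply step.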
The console access question
Should engineers have console access to production? The answer should be "emergency only, with audit trail, with a process to revert changes to IaC within 24 hours." Console access for routine work is the primary cause of configuration drift. Session Manager + IaC-only changes + break-glass procedures for emergencies is the mature posture.
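One way to keep break-glass console use honest is to review which write operations came from a console session. The sketch below filters already-parsed CloudTrail records on two record fields, `sessionCredentialFromConsole` and `readOnly`. The function name and sample records are hypothetical, and how you fetch and parse the records (S3 delivery, Athena, an event bus) is left as an assumption.

```python
from __future__ import annotations


def console_write_events(records: list[dict]) -> list[dict]:
    """Return CloudTrail records for non-read-only calls made from a console session.

    CloudTrail only includes sessionCredentialFromConsole when it is true,
    so a missing field is treated as "not from the console".
    """
    flagged = []
    for record in records:
        from_console = str(record.get("sessionCredentialFromConsole", "")).lower() == "true"
        # Anything not explicitly marked read-only is treated as a write.
        is_write = record.get("readOnly") is not True
        if from_console and is_write:
            flagged.append(record)
    return flagged


if __name__ == "__main__":
    # Sample records (assumed data, trimmed to the fields the filter reads).
    sample = [
        {"eventName": "AuthorizeSecurityGroupIngress", "readOnly": False,
         "sessionCredentialFromConsole": "true"},
        {"eventName": "DescribeInstances", "readOnly": True,
         "sessionCredentialFromConsole": "true"},
    ]
    for event in console_write_events(sample):
        print(event["eventName"])
```

Each flagged event is a candidate for the "revert to IaC within 24 hours" follow-up: either the change is codified or it is rolled back.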
CIS Benchmarks: the industry standard hardening baseline
The Center for Internet Security (CIS) publishes hardening benchmarks for cloud platforms, operating systems, containers, and applications. These are consensus standards developed with hundreds of security experts and represent the minimum security baseline for each technology.
| CIS Benchmark | Coverage | Automated scanner |
|---|---|---|
| CIS AWS Foundations | IAM, S3, CloudTrail, VPC, KMS, monitoring | Prowler, AWS Security Hub |
| CIS Kubernetes | API server, etcd, kubelet, network policies, RBAC | kube-bench |
| CIS Docker | Host config, daemon config, container runtime, images | Docker Bench for Security |
| CIS Linux (Ubuntu/RHEL) | Filesystem, services, network, logging, access | CIS-CAT, Lynis |
| CIS GCP | IAM, cloud storage, logging, networking, SQL | Security Command Center (formerly Cloud Security Command Center), Forseti (archived project) |
| CIS Azure | IAM, storage, database, networking, monitoring | Microsoft Defender for Cloud (formerly Azure Security Center), Prowler |
Run these benchmarks against your environments on initial setup and continuously thereafter. They provide a prioritized, evidence-based starting point for hardening — much better than starting from scratch.
Start with CIS Level 1 — it is the 80/20
CIS benchmarks define two profiles: Level 1 (essential, low operational impact) and Level 2 (more comprehensive, may affect functionality). As a rule of thumb, Level 1 addresses the bulk of common misconfiguration risk with minimal operational disruption. Implement Level 1 first for all environments, then add Level 2 for environments handling sensitive data (PCI, HIPAA scope).
```bash
#!/bin/bash
# Run the Prowler CIS AWS Foundations benchmark against your AWS account.
# Requires: AWS credentials with SecurityAudit + ViewOnlyAccess permissions
# Install: pip install prowler
# Flag and framework names below follow Prowler v4 and differ between versions;
# confirm them with `prowler aws --help` and `prowler aws --list-compliance`.

# Run a full CIS check (Level 1 and Level 2 are profiles inside each CIS version)
prowler aws \
  --compliance cis_2.0_aws \
  --output-formats html json-ocsf csv \
  --output-directory ./security-reports/prowler \
  --severity critical high

# Run the checks for a single service (e.g., IAM only)
prowler aws \
  --service iam \
  --output-formats json-ocsf

# Run against multiple accounts (with assume-role).
# Account IDs are 12-digit placeholders; substitute your own.
for ACCOUNT_ID in 111111111111 222222222222 333333333333; do
  prowler aws \
    --role "arn:aws:iam::${ACCOUNT_ID}:role/ProwlerAuditRole" \
    --compliance cis_2.0_aws \
    --output-formats json-ocsf \
    --output-directory "./security-reports/${ACCOUNT_ID}"
done

# Schedule with cron (weekly baseline report, Mondays at 00:00).
# Percent signs must be escaped inside a crontab entry:
# 0 0 * * 1 /scripts/run-prowler.sh > /var/log/prowler-$(date +\%Y\%m\%d).log
```
Kubernetes secure configuration: the admission controller pattern
In Kubernetes, configuration security is enforced at admission time — when a resource is created or updated. Admission controllers (OPA Gatekeeper, Kyverno) evaluate every manifest against policy before it is applied to the cluster.
Essential Kubernetes security configuration checks
- No root containers — securityContext.runAsNonRoot: true and runAsUser >= 1000. Root containers can escape container boundaries more easily via kernel exploits
- Read-only root filesystem — securityContext.readOnlyRootFilesystem: true. Prevents an attacker who gets code execution from writing files, installing tools, or modifying configs
- No privileged containers — securityContext.privileged: false. Privileged containers have near-host-level access — essentially root on the node
- No hostPath mounts — Mounting host directories into containers can expose sensitive host files (/etc/passwd, /var/log, SSH keys)
- Network policies (default deny) — Default-deny NetworkPolicy isolates pods — a compromised pod cannot reach the database unless a policy explicitly allows it
- Resource limits: CPU and memory limits stop a compromised or runaway container from exhausting node resources and starving neighboring workloads (DoS within the cluster)
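The checks in this list can be expressed as a small validation function, which is essentially what a manifest scanner or admission controller does. The sketch below takes a Pod manifest already loaded into a dictionary and reports violations of the hostPath, privileged, non-root, read-only filesystem, and resource-limit rules. The function name and message wording are hypothetical, and initContainers and network policies are deliberately left out to keep it short.

```python
from __future__ import annotations


def pod_violations(pod: dict) -> list[str]:
    """Return human-readable violations for a Pod manifest (as a dict)."""
    spec = pod.get("spec") or {}
    pod_sc = spec.get("securityContext") or {}
    problems = []

    for volume in spec.get("volumes") or []:
        if "hostPath" in volume:
            problems.append(f"volume {volume.get('name')}: hostPath mount")

    for container in spec.get("containers") or []:
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}

        if sc.get("privileged") is True:
            problems.append(f"{name}: privileged container")

        # Container-level runAsNonRoot overrides the pod-level setting.
        run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        if run_as_non_root is not True:
            problems.append(f"{name}: runAsNonRoot is not true")

        if sc.get("readOnlyRootFilesystem") is not True:
            problems.append(f"{name}: root filesystem is writable")

        limits = (container.get("resources") or {}).get("limits") or {}
        if "cpu" not in limits or "memory" not in limits:
            problems.append(f"{name}: missing CPU or memory limit")

    return problems


if __name__ == "__main__":
    # Sample manifest (assumed data): a privileged container with no hardening.
    risky_pod = {"spec": {"containers": [
        {"name": "app", "securityContext": {"privileged": True}},
    ]}}
    for problem in pod_violations(risky_pod):
        print(problem)
```

Running a check like this in CI catches problems early, while an admission controller enforces the same rules for anything that reaches the cluster by another route.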
```yaml
# Kyverno policy: enforce container security best practices
# Applied at admission time: pods violating policy are rejected
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-pod-security
  annotations:
    policies.kyverno.io/title: Restrict Pod Security
    policies.kyverno.io/severity: high
    policies.kyverno.io/description: >-
      Enforce security context requirements for all pods.
      Pods running as root or privileged are rejected.
spec:
  validationFailureAction: Enforce  # Reject non-compliant pods (use Audit to dry-run first)
  background: true
  rules:
    - name: restrict-privileged
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Privileged containers are not allowed."
        pattern:
          spec:
            containers:
              - =(securityContext):
                  =(privileged): false

    - name: require-non-root
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Containers must not run as root. Set runAsNonRoot: true."
        pattern:
          spec:
            securityContext:
              runAsNonRoot: true
            containers:
              - =(securityContext):
                  =(runAsUser): ">0"

    - name: require-read-only-root
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Root filesystem must be read-only."
        pattern:
          spec:
            containers:
              - securityContext:
                  readOnlyRootFilesystem: true
```
Detecting and responding to configuration drift
Even with perfect IaC and policy scanning, configuration drift happens: emergency console changes, third-party tools that modify config, or AWS service updates that change default behaviors. Continuous compliance is the detection layer.
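At its core, drift detection is a comparison between the state declared in IaC and the state observed in the live environment. The sketch below shows that comparison on two nested dictionaries. The function name, the output format, and the sample settings are all hypothetical; real tools work from provider APIs and the Terraform state rather than hand-built dictionaries.

```python
from __future__ import annotations


def find_drift(desired: dict, actual: dict, prefix: str = "") -> list[str]:
    """Compare declared and observed settings and describe each difference."""
    drift = []
    for key in sorted(set(desired) | set(actual)):
        path = f"{prefix}.{key}" if prefix else key
        if key not in actual:
            drift.append(f"{path}: missing (expected {desired[key]!r})")
        elif key not in desired:
            drift.append(f"{path}: unmanaged value {actual[key]!r}")
        elif isinstance(desired[key], dict) and isinstance(actual[key], dict):
            # Recurse into nested settings so the report names the exact field.
            drift.extend(find_drift(desired[key], actual[key], path))
        elif desired[key] != actual[key]:
            drift.append(f"{path}: expected {desired[key]!r}, found {actual[key]!r}")
    return drift


if __name__ == "__main__":
    # Sample settings (assumed data): encryption was switched off by hand.
    declared = {"encryption": True, "ingress": {"port": 443}}
    observed = {"encryption": False, "ingress": {"port": 443}, "public": True}
    for line in find_drift(declared, observed):
        print(line)
```

Reporting unmanaged values as well as changed ones matters: a setting that exists only in the live environment is often the trace of a console edit.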
Continuous compliance architecture
1. Define policy rules as code (AWS Config Rules, Cloud Custodian policies, or OPA policies)
2. Schedule evaluation: AWS Config evaluates resource changes in near real time; Cloud Custodian can run on a schedule or in event-driven mode
3. Alert on violations: high-severity findings go to PagerDuty or a Slack security channel immediately; medium/low findings create Jira tickets
4. Auto-remediate safe fixes: AWS Config remediation actions or Cloud Custodian actions can auto-fix deterministic violations (re-enable S3 block public access, remove public EC2 IP), but only for changes with no service impact
5. Track as security debt: medium/low findings that cannot be auto-remediated go into the security backlog with owner and SLA
6. Report for compliance: generate compliance reports from AWS Config or Cloud Custodian output for SOC 2, HIPAA, or internal audits
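The alerting and remediation steps above amount to a routing decision per finding. A minimal sketch of that decision follows, assuming a simple Finding record and an allow-list of rules that are safe to fix automatically. The rule identifiers, the class, and the three route names are hypothetical and stand in for whatever your compliance tool emits.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical rule IDs whose fixes are deterministic and have no service impact.
SAFE_AUTO_FIXES = {"s3-public-access-block-disabled"}


@dataclass(frozen=True)
class Finding:
    rule_id: str
    resource: str
    severity: str  # "critical", "high", "medium", or "low"


def route(finding: Finding) -> str:
    """Decide how a compliance finding is handled."""
    if finding.rule_id in SAFE_AUTO_FIXES:
        return "auto-remediate"
    if finding.severity in {"critical", "high"}:
        return "page"  # immediate alert to the on-call or security channel
    return "ticket"  # backlog item with an owner and SLA


if __name__ == "__main__":
    # Sample findings (assumed data).
    findings = [
        Finding("s3-public-access-block-disabled", "bucket-a", "high"),
        Finding("sg-open-ssh", "sg-123", "critical"),
        Finding("missing-owner-tag", "i-456", "low"),
    ]
    for finding in findings:
        print(finding.resource, route(finding))
```

Keeping the auto-fix allow-list explicit and short reflects the caveat in step 4: only violations with a deterministic, zero-impact fix should bypass a human.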
AWS Config vs Cloud Custodian: when to use each
AWS Config excels at AWS-native resource evaluation with managed rules and native remediation. Cloud Custodian is more flexible — supports multi-cloud (AWS, GCP, Azure), has richer action capabilities (email, Slack, tag, stop, quarantine), and can be version-controlled alongside your IaC. Use AWS Config for AWS-native compliance reporting; Cloud Custodian for multi-cloud or complex remediation workflows.
How this might come up in interviews
Cloud Security Engineer, DevSecOps Engineer, and Platform Engineering roles. Often in "design a secure cloud account baseline" questions or "how would you enforce least privilege at scale?" system design prompts.
Common questions:
- What is the difference between configuration management and secure configuration?
- How would you prevent a developer from making an ad-hoc console change to a production security group?
- Explain the Capital One breach at a technical level. What single configuration change would have prevented it?
- What are CIS Benchmarks and how do you use them in practice?
- Walk me through how you would enforce "no public S3 buckets" across 50 AWS accounts at an organization.
- What is configuration drift and how do you detect and remediate it automatically?
- What tools would you use to scan Kubernetes manifests for security misconfigurations before deployment?
Strong answer: Mentions IMDSv2 when discussing AWS instance security. Knows that Checkov / tfsec run in CI before terraform apply. Can explain OPA Rego policy syntax conceptually. Talks about golden AMIs or base Terraform modules as a way to bake in security defaults. Mentions CIS Benchmarks without prompting.
Red flags: Thinks "secure configuration" means having a security team review configs manually. Cannot name any IaC policy scanning tool. Has no answer for how to handle configuration drift. Thinks least privilege is a code concern rather than a configuration concern.
Before you move on: can you answer these?
What is configuration drift and why is it a security risk?
Configuration drift is when the actual state of a system diverges from its intended or documented state, usually through manual changes (console edits, SSH-and-fix hotfixes) that bypass the IaC-managed config. It's a security risk because drifted configs may introduce vulnerabilities (opening a security group, disabling encryption) that don't appear in the IaC code and may go undetected for months.
Why is "least privilege" considered a configuration property, not a code property?
Least privilege is about what a system or service is allowed to do — defined in IAM roles, security groups, RBAC, network policies — not in application code. A perfectly written application can be compromised if it runs with an over-permissive IAM role that allows it (or an attacker exploiting it) to access all S3 buckets or assume admin. Configuration defines the blast radius; code determines the attack surface.
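Because least privilege lives in configuration, it can be checked like any other configuration. The sketch below scans an IAM policy document for Allow statements that pair a wildcard action with a wildcard resource, which is the "access all S3 buckets" pattern described above. The function names are hypothetical, and the rule is intentionally narrow; it ignores conditions, NotAction, and resource-level wildcards such as a bucket ARN ending in `*`.

```python
from __future__ import annotations


def _as_list(value) -> list:
    """IAM allows a single string or a list for Action/Resource/Statement."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy: dict) -> list[dict]:
    """Return Allow statements that grant wildcard actions on all resources."""
    flagged = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            flagged.append(statement)
    return flagged


if __name__ == "__main__":
    # Sample policy (assumed data): the first statement is far too broad.
    sample_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
            {"Effect": "Allow", "Action": ["s3:GetObject"],
             "Resource": "arn:aws:s3:::app-assets/*"},
        ],
    }
    print(len(overly_broad_statements(sample_policy)))
```

The second statement in the sample shows the scoped alternative: one action on one bucket's objects, so a compromised application can read only what it was meant to read.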
What is the difference between Checkov/tfsec (IaC scanner) and AWS Config (continuous compliance)?
Checkov/tfsec scan Terraform or CloudFormation code in CI before resources are created — they prevent misconfigurations from ever reaching production. AWS Config evaluates live cloud resources continuously against rules — it detects misconfigurations that slipped through (console changes, API calls, drift) and alerts or auto-remediates. You need both: prevent during development, detect after deploy.
From the books
DevSecOps: A Leader's Guide to Producing Secure Software
Chapter 7: Securing the Infrastructure Layer
The book emphasizes that configuration security is a "force multiplier" — getting it right protects everything running on the infrastructure, regardless of code-level vulnerabilities. It recommends treating infrastructure configuration with the same code review standards as application code: pull request, automated scan, peer review, automated test before merge.
💡 Analogy
Configuration is the locks, windows, and alarm system of your house — and misconfiguration is accidentally leaving the back door open. The technology (the house) can be excellent, but if the lock is set to "anyone can enter," the house is not secure. The key insight: most configuration security isn't about clever technical controls — it's about systematically ensuring defaults are safe, changes are reviewed, and drift is detected and corrected quickly.
⚡ Core Idea
Secure configuration has three layers: prevent (define correct config in IaC, enforce with policy-as-code before apply), detect (continuous compliance scans find what slipped through), and remediate (alert or auto-correct within a defined SLA). Misconfiguration is the #1 cause of cloud breaches not because the tools are bad, but because manually managing configuration at scale without automation is impossible — defaults get left on, quick fixes get made and forgotten.
🎯 Why It Matters
Gartner predicts that through 2025, 99% of cloud security failures will be the customer's fault, and most of those trace back to misconfiguration rather than sophisticated attacks. Capital One ($80M fine, 100M records), Twitch (125GB source code leak), Facebook (540M records in a publicly accessible S3 bucket run by a third-party developer): these weren't zero-days. They were misconfigurations of the kind an automated policy check is designed to catch.