🎯 Key Takeaways

☁️ Misconfiguration is the #1 cloud breach cause — not sophisticated attacks. A public S3 bucket or over-permissive IAM role is all it takes

📋 CIS Benchmarks are your starting hardening checklist — automated tools (Prowler, Scout Suite) scan cloud accounts against them in minutes

🔒 IaC + policy as code = the only scalable approach: config in version control, policies enforced before apply, continuous checks for drift

🚫 Never use the console for production changes — every console edit is an unreviewed, unaudited, unreverted risk

⚡ Detect drift within hours, not months — continuous compliance tools (AWS Config, Cloud Custodian) run constantly and alert on violations

🎯 Least privilege is a configuration property, not a code property — IAM roles, security groups, and RBAC are configuration decisions made at deploy time

Secure Configuration

Hardening infrastructure and application config — using IaC, policy as code, and continuous compliance — so default and drifted settings never create exploitable risk.

Why configuration is the biggest attack surface you're probably ignoring

Security Misconfiguration has appeared in every recent edition of the OWASP Top 10 (A05 in the 2021 list), and Verizon's Data Breach Investigations Report repeatedly lists misconfiguration among the most common error-driven causes of breaches. And yet most development teams invest heavily in code security (SAST, code review, secure coding) while their infrastructure configuration is managed by whoever has console access and a deadline.

Misconfiguration facts

Gartner predicted that through 2025, 99% of cloud security failures would be the customer's fault, almost all of it misconfiguration rather than platform vulnerabilities. Capital One: $80M fine, 100M+ records. Twitch: 125GB leak. Facebook: 540M user records exposed by a third-party app developer in a public S3 bucket. None of these required a sophisticated zero-day exploit. They required finding a misconfigured service.

Configuration security is not just about one setting. It covers every layer of your stack: cloud IAM and network rules, container security context and network policies, application environment variables, and infrastructure-as-code definitions. Each layer has its own failure modes. The example below shows a common misconfiguration at the cloud/IAM layer and how to prevent it.

Secure Configuration: Layer Explorer

☁️ Cloud / IAM Configuration

Covers: AWS IAM roles, S3 bucket policies, security groups, VPC config

Common misconfiguration: Public S3 bucket (severity: critical)

⚠ Real-world example

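Capital One breach (2019): an SSRF flaw in a misconfigured WAF let the attacker obtain credentials for the WAF's over-permissive IAM role, which could read S3 buckets holding data on 100M+ customers. The buckets themselves were not public; the excess permissions on the role were the misconfiguration.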

✅ How to fix it

Enable S3 Block Public Access at the account level (aws_s3_account_public_access_block in Terraform) and on each bucket (aws_s3_bucket_public_access_block), and enforce both with an OPA policy in CI.
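To make the "enforce in CI" step concrete, here is a minimal Python sketch of a pre-apply check. It assumes the plan has been exported with `terraform show -json` and parsed into a dict, and it only inspects the `resource_changes` entries for the public access block resources. The function name and the sample plan are hypothetical; a real pipeline would more likely use OPA/Conftest or Checkov.

```python
# The four settings exposed by the S3 public access block resources.
REQUIRED_FLAGS = (
    "block_public_acls",
    "block_public_policy",
    "ignore_public_acls",
    "restrict_public_buckets",
)

PUBLIC_ACCESS_BLOCK_TYPES = (
    "aws_s3_bucket_public_access_block",
    "aws_s3_account_public_access_block",
)


def find_public_access_violations(plan: dict) -> list[str]:
    """Return one message per planned resource that weakens Block Public Access."""
    violations = []
    for change in plan.get("resource_changes", []):
        if change.get("type") not in PUBLIC_ACCESS_BLOCK_TYPES:
            continue
        # "after" is null when the resource is being destroyed, which also
        # removes the protection, so it is reported as a violation too.
        after = (change.get("change") or {}).get("after") or {}
        disabled = [flag for flag in REQUIRED_FLAGS if after.get(flag) is not True]
        if disabled:
            address = change.get("address", "<unknown>")
            violations.append(f"{address}: not enabled: {', '.join(disabled)}")
    return violations


# Hypothetical plan fragment: one compliant block, one that allows public policies.
sample_plan = {
    "resource_changes": [
        {
            "address": "aws_s3_bucket_public_access_block.logs",
            "type": "aws_s3_bucket_public_access_block",
            "change": {"after": {flag: True for flag in REQUIRED_FLAGS}},
        },
        {
            "address": "aws_s3_bucket_public_access_block.assets",
            "type": "aws_s3_bucket_public_access_block",
            "change": {
                "after": {
                    "block_public_acls": True,
                    "block_public_policy": False,
                    "ignore_public_acls": True,
                    "restrict_public_buckets": True,
                }
            },
        },
        {"address": "aws_instance.web", "type": "aws_instance", "change": {"after": {}}},
    ]
}

for message in find_public_access_violations(sample_plan):
    print(message)
```

In CI, a non-empty result would fail the job before `terraform apply` runs. Note that this sketch does not verify that every bucket has a matching block resource; that cross-reference is left out for brevity.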

The three defenses

Prevent: IaC policy scan in CI before apply
Detect: continuous compliance (AWS Config, Cloud Custodian)
Remediate: auto-remediate, or alert and ticket within SLA

The three pillars: prevent, detect, remediate

Secure configuration at scale requires three mutually reinforcing controls:

The three-pillar model

Prevent — IaC + policy as code — Define all configuration in version-controlled IaC (Terraform, CloudFormation). Every change goes through PR review. Policy scanning (Checkov, tfsec, OPA/Conftest) enforces rules before terraform apply — blocking non-compliant config before it exists
Detect — continuous compliance — AWS Config, Cloud Custodian, or Wiz scan live cloud resources continuously against policy rules. Detects drift from IaC (console changes), new resources that bypassed the pipeline, and policy violations that evolved over time
Remediate — alert + auto-fix — When drift is detected: high-severity findings auto-remediate (e.g., auto-close public S3 bucket). Lower severity creates tickets with SLA. The rule: detection without remediation is just documentation of failure

The console access question

Should engineers have console access to production? The answer should be "emergency only, with audit trail, with a process to revert changes to IaC within 24 hours." Console access for routine work is the primary cause of configuration drift. Session Manager + IaC-only changes + break-glass procedures for emergencies is the mature posture.
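One way to put an audit trail behind that posture is to flag write operations that came from a console session. The sketch below is illustrative only: it assumes CloudTrail records have already been loaded as dicts, and it relies on the `sessionCredentialFromConsole` and `readOnly` fields of the CloudTrail record format. The function name and sample events are hypothetical.

```python
def console_write_events(records: list[dict]) -> list[dict]:
    """Return a summary of mutating API calls made from a console session."""
    flagged = []
    for record in records:
        # CloudTrail only includes this field when the call came from the console.
        from_console = record.get("sessionCredentialFromConsole") in ("true", True)
        is_write = record.get("readOnly") is False
        if from_console and is_write:
            flagged.append(
                {
                    "who": (record.get("userIdentity") or {}).get("arn", "<unknown>"),
                    "service": record.get("eventSource"),
                    "action": record.get("eventName"),
                    "when": record.get("eventTime"),
                }
            )
    return flagged


# Hypothetical records: one console edit, one pipeline change, one console read.
sample_records = [
    {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "AuthorizeSecurityGroupIngress",
        "eventTime": "2024-01-15T10:02:11Z",
        "readOnly": False,
        "sessionCredentialFromConsole": "true",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
    },
    {
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutBucketPolicy",
        "eventTime": "2024-01-15T10:05:40Z",
        "readOnly": False,
        "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/terraform-ci/run"},
    },
    {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "DescribeInstances",
        "eventTime": "2024-01-15T10:06:02Z",
        "readOnly": True,
        "sessionCredentialFromConsole": "true",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"},
    },
]

for event in console_write_events(sample_records):
    print(event["who"], event["action"])
```

Each flagged event is a candidate for the "revert to IaC within 24 hours" follow-up: either the change is codified in Terraform or it is rolled back.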

CIS Benchmarks: the industry standard hardening baseline

The Center for Internet Security (CIS) publishes hardening benchmarks for cloud platforms, operating systems, containers, and applications. These are consensus standards developed with hundreds of security experts and represent the minimum security baseline for each technology.

CIS Benchmark	Coverage	Automated scanner
CIS AWS Foundations	IAM, S3, CloudTrail, VPC, KMS, monitoring	Prowler, AWS Security Hub
CIS Kubernetes	API server, etcd, kubelet, network policies, RBAC	kube-bench
CIS Docker	Host config, daemon config, container runtime, images	Docker Bench for Security
CIS Linux (Ubuntu/RHEL)	Filesystem, services, network, logging, access	CIS-CAT, Lynis
CIS GCP	IAM, cloud storage, logging, networking, SQL	Security Command Center, Prowler
CIS Azure	IAM, storage, database, networking, monitoring	Microsoft Defender for Cloud (formerly Azure Security Center), Prowler

Run these benchmarks against your environments on initial setup and continuously thereafter. They provide a prioritized, evidence-based starting point for hardening — much better than starting from scratch.

Start with CIS Level 1 — it is the 80/20

CIS benchmarks define two profiles: Level 1 (essential, low operational impact) and Level 2 (more comprehensive, may affect functionality). Level 1 addresses most common misconfiguration risk with minimal operational disruption, which is the 80/20 intuition (a rule of thumb, not a measured figure). Implement Level 1 first in all environments, then add Level 2 for environments handling sensitive data (PCI, HIPAA scope).

scripts/run-prowler.sh

#!/bin/bash
# Run the Prowler CIS AWS Foundations benchmark against your AWS account
# Requires: AWS credentials with SecurityAudit + ViewOnlyAccess permissions
# Install: pip install prowler
#
# Flags below assume Prowler 4.x naming. Framework IDs and output formats
# differ between releases, so confirm them for your version with:
#   prowler aws --list-compliance
#   prowler aws --help

# Run the CIS benchmark, reporting only critical and high findings.
# Framework IDs are versioned (cis_2.0_aws is one example); filter for
# Level 1 items in the report if you only want that profile.
prowler aws \
  --compliance cis_2.0_aws \
  --output-formats csv html json-ocsf \
  --output-directory ./security-reports/prowler \
  --severity critical high

# Run a single service's checks (e.g., IAM only)
prowler aws \
  --service iam \
  --output-formats json-ocsf

# Run against multiple accounts (with assume-role).
# Account IDs are 12-digit placeholders; replace with your own.
for ACCOUNT_ID in 123456789012 210987654321 456789123456; do
  prowler aws \
    --role "arn:aws:iam::${ACCOUNT_ID}:role/ProwlerAuditRole" \
    --compliance cis_2.0_aws \
    --output-formats json-ocsf \
    --output-directory "./security-reports/${ACCOUNT_ID}"
done

# Schedule a weekly baseline report with cron (Mondays at 00:00).
# Percent signs must be escaped inside a crontab entry:
# 0 0 * * 1 /scripts/run-prowler.sh > /var/log/prowler-$(date +\%Y\%m\%d).log 2>&1

Kubernetes secure configuration: the admission controller pattern

In Kubernetes, configuration security is enforced at admission time — when a resource is created or updated. Admission controllers (OPA Gatekeeper, Kyverno) evaluate every manifest against policy before it is applied to the cluster.

Essential Kubernetes security configuration checks

No root containers — securityContext.runAsNonRoot: true and runAsUser >= 1000. Root containers can escape container boundaries more easily via kernel exploits
Read-only root filesystem — securityContext.readOnlyRootFilesystem: true. Prevents an attacker who gets code execution from writing files, installing tools, or modifying configs
No privileged containers — securityContext.privileged: false. Privileged containers have near-host-level access — essentially root on the node
No hostPath mounts — Mounting host directories into containers can expose sensitive host files (/etc/passwd, /var/log, SSH keys)
Network policies (default deny) — Default-deny NetworkPolicy isolates pods — a compromised pod cannot reach the database unless a policy explicitly allows it
Resource limits: CPU and memory limits stop a compromised or runaway container from exhausting node resources and starving neighboring workloads (DoS within the cluster). They do not prevent container escape.
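The per-pod checks in this list can be sketched as a small manifest audit in Python. This is an illustration of the logic an admission policy encodes, not a replacement for Gatekeeper or Kyverno. It assumes the pod manifest has already been parsed into a dict; the function name and sample pods are hypothetical. NetworkPolicy coverage is a namespace-level property and is not checked here.

```python
def audit_pod(pod: dict) -> list[str]:
    """Return a list of security findings for a parsed Pod manifest."""
    spec = pod.get("spec") or {}
    pod_sc = spec.get("securityContext") or {}
    findings = []

    # hostPath volumes expose the node's filesystem to the pod.
    for volume in spec.get("volumes") or []:
        if "hostPath" in volume:
            findings.append(f"volume {volume.get('name', '<unnamed>')}: hostPath mount")

    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}

        # A container-level runAsNonRoot overrides the pod-level value.
        run_as_non_root = sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot"))
        if run_as_non_root is not True:
            findings.append(f"{name}: runAsNonRoot is not set to true")
        if sc.get("privileged") is True:
            findings.append(f"{name}: privileged container")
        if sc.get("readOnlyRootFilesystem") is not True:
            findings.append(f"{name}: root filesystem is writable")

        limits = (container.get("resources") or {}).get("limits") or {}
        if "cpu" not in limits or "memory" not in limits:
            findings.append(f"{name}: missing CPU or memory limit")
    return findings


# Hypothetical manifests: one hardened, one risky.
hardened_pod = {
    "spec": {
        "securityContext": {"runAsNonRoot": True},
        "containers": [
            {
                "name": "app",
                "securityContext": {"readOnlyRootFilesystem": True, "privileged": False},
                "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
            }
        ],
    }
}
risky_pod = {
    "spec": {
        "volumes": [{"name": "host-etc", "hostPath": {"path": "/etc"}}],
        "containers": [{"name": "app", "securityContext": {"privileged": True}}],
    }
}

for finding in audit_pod(risky_pod):
    print(finding)
```

A check like this is useful in CI for fast feedback on manifests, while the admission controller remains the enforcement point inside the cluster.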

k8s/policy/restrict-privileged.yaml

# Kyverno policy: enforce container security best practices
# Applied at admission time; pods violating policy are rejected
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-pod-security
  annotations:
    policies.kyverno.io/title: Restrict Pod Security
    policies.kyverno.io/severity: high
    policies.kyverno.io/description: >-
      Enforce security context requirements for all pods.
      Pods running as root or privileged are rejected.
spec:
  validationFailureAction: Enforce  # Reject non-compliant pods (use Audit to dry-run first)
  background: true
  rules:
    - name: restrict-privileged
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Privileged containers are not allowed."
        pattern:
          spec:
            containers:
              - =(securityContext):
                  =(privileged): false

    - name: require-non-root
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Containers must not run as root. Set runAsNonRoot: true."
        pattern:
          spec:
            securityContext:
              runAsNonRoot: true
            containers:
              - =(securityContext):
                  =(runAsUser): ">0"

    - name: require-read-only-root
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "Root filesystem must be read-only."
        pattern:
          spec:
            containers:
              - securityContext:
                  readOnlyRootFilesystem: true

Detecting and responding to configuration drift

Even with perfect IaC and policy scanning, configuration drift happens: emergency console changes, third-party tools that modify config, or AWS service updates that change default behaviors. Continuous compliance is the detection layer.
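At its core, drift detection is a comparison between the configuration your IaC declares and the configuration that is actually live. The following Python sketch shows that comparison on nested dicts. It is a simplified model under the assumption that both states have been normalized to the same shape; the function name and the sample security-group-like settings are hypothetical.

```python
def diff_config(desired: dict, actual: dict, path: str = "") -> list[str]:
    """Recursively compare declared and live settings and describe each difference."""
    drift = []
    for key in sorted(set(desired) | set(actual)):
        here = f"{path}.{key}" if path else key
        if key not in actual:
            drift.append(f"{here}: missing in live config (expected {desired[key]!r})")
        elif key not in desired:
            drift.append(f"{here}: unmanaged setting {actual[key]!r}")
        elif isinstance(desired[key], dict) and isinstance(actual[key], dict):
            drift.extend(diff_config(desired[key], actual[key], here))
        elif desired[key] != actual[key]:
            drift.append(f"{here}: expected {desired[key]!r}, found {actual[key]!r}")
    return drift


# Hypothetical example: someone widened the ingress CIDR and added a public IP by hand.
desired_state = {
    "encryption": {"enabled": True},
    "ingress": {"port": 443, "cidr": "10.0.0.0/8"},
}
live_state = {
    "encryption": {"enabled": True},
    "ingress": {"port": 443, "cidr": "0.0.0.0/0"},
    "public_ip": True,
}

for line in diff_config(desired_state, live_state):
    print(line)
```

Tools such as `terraform plan`, AWS Config, and Cloud Custodian perform far richer versions of this comparison, but the output has the same purpose: a list of settings that no longer match what was reviewed.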

Continuous compliance architecture

1. Define policy rules as code (AWS Config Rules, Cloud Custodian policies, or OPA policies)
2. Schedule evaluation: AWS Config evaluates every resource change in real-time; Cloud Custodian can run on a schedule or event-driven
3. Alert on violations: high-severity findings go to PagerDuty or Slack security channel immediately; medium/low create Jira tickets
4. Auto-remediate safe fixes: AWS Config remediation actions or Cloud Custodian actions can auto-fix deterministic violations (re-enable S3 block public access, remove public EC2 IP), but only for changes with no service impact
5. Track as security debt: medium/low findings that cannot be auto-remediated go into the security backlog with owner and SLA
6. Report for compliance: generate compliance reports from AWS Config or Cloud Custodian output for SOC 2, HIPAA, or internal audits
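The alerting, auto-remediation, and backlog rules above amount to a routing decision per finding. A minimal Python sketch of that decision follows. The `Finding` type, the action names, and the `safe_to_auto_fix` flag are hypothetical; a real setup would map them to AWS Config remediation actions, Cloud Custodian actions, a pager, and a ticketing system.

```python
from dataclasses import dataclass

URGENT_SEVERITIES = ("critical", "high")


@dataclass(frozen=True)
class Finding:
    resource: str
    rule: str
    severity: str  # "critical", "high", "medium", or "low"
    safe_to_auto_fix: bool = False  # deterministic fix with no service impact


def route(finding: Finding) -> list[str]:
    """Decide which actions a compliance finding should trigger."""
    actions = []
    if finding.safe_to_auto_fix:
        actions.append("auto_remediate")
    if finding.severity in URGENT_SEVERITIES:
        # Urgent findings always notify a human, even if a fix was applied.
        actions.append("alert_security_channel")
    elif not finding.safe_to_auto_fix:
        # Lower-severity findings that cannot be fixed automatically become backlog items.
        actions.append("create_ticket")
    return actions


# Hypothetical findings covering each branch.
examples = [
    Finding("s3://customer-data", "s3-block-public-access", "critical", safe_to_auto_fix=True),
    Finding("sg-0abc", "no-open-ssh", "high"),
    Finding("i-0def", "no-public-ip", "medium", safe_to_auto_fix=True),
    Finding("vol-0123", "ebs-encryption", "low"),
]

for finding in examples:
    print(finding.rule, route(finding))
```

Keeping the routing rules in code means they can be reviewed and versioned alongside the policies themselves.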

AWS Config vs Cloud Custodian: when to use each

AWS Config excels at AWS-native resource evaluation with managed rules and native remediation. Cloud Custodian is more flexible — supports multi-cloud (AWS, GCP, Azure), has richer action capabilities (email, Slack, tag, stop, quarantine), and can be version-controlled alongside your IaC. Use AWS Config for AWS-native compliance reporting; Cloud Custodian for multi-cloud or complex remediation workflows.

How this might come up in interviews

This topic comes up in interviews for Cloud Security Engineer, DevSecOps Engineer, and Platform Engineering roles, often as a "design a secure cloud account baseline" question or a "how would you enforce least privilege at scale?" system design prompt.

Common questions:

What is the difference between configuration management and secure configuration?
How would you prevent a developer from making an ad-hoc console change to a production security group?
Explain the Capital One breach at a technical level. What single configuration change would have prevented it?
What are CIS Benchmarks and how do you use them in practice?
Walk me through how you would enforce "no public S3 buckets" across 50 AWS accounts at an organization.
What is configuration drift and how do you detect and remediate it automatically?
What tools would you use to scan Kubernetes manifests for security misconfigurations before deployment?

Strong answer: Mentions IMDSv2 when discussing AWS instance security. Knows that Checkov / tfsec run in CI before terraform apply. Can explain OPA Rego policy syntax conceptually. Talks about golden AMIs or base Terraform modules as a way to bake in security defaults. Mentions CIS Benchmarks without prompting.

Red flags: Thinks "secure configuration" means having a security team review configs manually. Cannot name any IaC policy scanning tool. Has no answer for how to handle configuration drift. Thinks least privilege is a code concern rather than a configuration concern.

Before you move on: can you answer these?

What is configuration drift and why is it a security risk?

Configuration drift is when the actual state of a system diverges from its intended/documented state — usually through manual changes (console edits, SSH and fix) that bypass the IaC-managed config. It's a security risk because drifted configs may introduce vulnerabilities (opening a security group, disabling encryption) that don't appear in the IaC code and may go undetected for months.

Why is "least privilege" considered a configuration property, not a code property?

Least privilege is about what a system or service is allowed to do — defined in IAM roles, security groups, RBAC, network policies — not in application code. A perfectly written application can be compromised if it runs with an over-permissive IAM role that allows it (or an attacker exploiting it) to access all S3 buckets or assume admin. Configuration defines the blast radius; code determines the attack surface.
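Because least privilege lives in policy documents, it can be checked mechanically. The sketch below scans an IAM policy document (the standard JSON structure with `Statement`, `Effect`, `Action`, and `Resource`) for Allow statements that pair a wildcard action with a wildcard resource. The function names and the sample policy are hypothetical, and the check is deliberately narrow: it ignores conditions, `NotAction`, and resource-level wildcards such as `arn:aws:s3:::*`.

```python
def _as_list(value) -> list:
    """IAM allows a single string or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def overly_permissive_statements(policy: dict) -> list[dict]:
    """Return Allow statements granting wildcard actions on all resources."""
    flagged = []
    for statement in _as_list(policy.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action"))
        resources = _as_list(statement.get("Resource"))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        wildcard_resource = "*" in resources
        if wildcard_action and wildcard_resource:
            flagged.append(statement)
    return flagged


# Hypothetical policy: the first statement is scoped, the second is not.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOneBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::app-assets/*",
        },
        {
            "Sid": "TooBroad",
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*",
        },
    ],
}

for statement in overly_permissive_statements(sample_policy):
    print(statement.get("Sid"))
```

The second statement is the kind of grant that turns a single application bug into access to every bucket in the account, which is the blast-radius point made above.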

What is the difference between Checkov/tfsec (IaC scanner) and AWS Config (continuous compliance)?

Checkov/tfsec scan Terraform or CloudFormation code in CI before resources are created — they prevent misconfigurations from ever reaching production. AWS Config evaluates live cloud resources continuously against rules — it detects misconfigurations that slipped through (console changes, API calls, drift) and alerts or auto-remediates. You need both: prevent during development, detect after deploy.

From the books

DevSecOps: A Leader's Guide to Producing Secure Software

Chapter 7: Securing the Infrastructure Layer

The book emphasizes that configuration security is a "force multiplier" — getting it right protects everything running on the infrastructure, regardless of code-level vulnerabilities. It recommends treating infrastructure configuration with the same code review standards as application code: pull request, automated scan, peer review, automated test before merge.

🧠Mental Model

💡 Analogy

Configuration is the locks, windows, and alarm system of your house — and misconfiguration is accidentally leaving the back door open. The technology (the house) can be excellent, but if the lock is set to "anyone can enter," the house is not secure. The key insight: most configuration security isn't about clever technical controls — it's about systematically ensuring defaults are safe, changes are reviewed, and drift is detected and corrected quickly.

⚡ Core Idea

Secure configuration rests on three pillars: prevent (define correct config in IaC, enforce with policy-as-code before apply), detect (continuous compliance scans find what slipped through), and remediate (alert or auto-correct within a defined SLA). Misconfiguration is the #1 cause of cloud breaches not because the tools are bad, but because manually managing configuration at scale without automation is impossible: defaults get left on, and quick fixes get made and forgotten.

🎯 Why It Matters

Gartner predicted that through 2025, 99% of cloud security failures would be the customer's fault, almost all due to misconfiguration rather than sophisticated attacks. Capital One ($80M fine, 100M records), Twitch (125GB source code leak), Facebook (540M user records exposed by a third-party developer in a public S3 bucket): these weren't zero-days. They were misconfigurations that an automated policy check would have caught.
