
Kubernetes Audit Logging: Who Did What, When

Kubernetes audit logs record every API request: who made it, what they did, and when. Without audit logging, security incidents are unrecoverable -- you cannot determine what an attacker accessed or modified. Audit logging is the forensic foundation of Kubernetes security.

Relevant for: Senior, Staff

Why this matters at your level

Senior

Enable kube-apiserver audit logging. Configure an AuditPolicy to capture relevant events without excessive noise. Ship logs to a SIEM. Know what the four audit stages are (RequestReceived, ResponseStarted, ResponseComplete, Panic).
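Enabling audit logging means pointing kube-apiserver at a policy file and a log destination. A sketch of the flags involved, assuming a kubeadm-style layout (the flag names are real kube-apiserver flags; paths and rotation values are illustrative):

```shell
# kube-apiserver audit flags (set in the static pod manifest or systemd unit).
# Paths and rotation values below are illustrative assumptions.
kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/audit.log \
  --audit-log-format=json \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=10 \
  --audit-log-maxsize=100
```

The on-node file is only a buffer: the maxage/maxbackup/maxsize rotation settings limit local disk use, while the shipper (below) gets the logs off the node.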

Staff

Design audit log retention policy (compliance: often 1 year minimum). Set up alerting on high-value events (Secret reads, ClusterRoleBinding creates, privileged pod creates). Integrate with SOC workflows for tier-1 security response.

~3 min read
Breach With No Forensic Trail -- Missing Audit Logs -- 2020

T-3w  Attacker enters cluster; no audit logs record the initial access method
T+0   Anomalous AWS charges detected; incident response begins
T+1d  Investigation finds no audit logs; cannot determine the scope of the breach
T+3d  Decision: treat the entire cluster as compromised; full rebuild begins
T+2w  Full cluster rebuild and credential rotation complete; estimated $150k response cost

The fallout: three weeks of attacker activity with zero forensic trail; a $150k incident response cost (vs pennies for log storage); a cluster rebuilt as the worst-case assumption.

The question this raises

What does Kubernetes audit logging capture, and what is unrecoverable if audit logs are absent during a security incident?

Test your assumption first

A pod was deleted at 3 AM and you need to determine who deleted it. Audit logging is enabled at Metadata level for pod deletions. What information is available in the audit log?

Lesson outline

What Audit Logging Solves

No Audit Logs = No Forensics

Without audit logs, a security incident becomes unrecoverable: you cannot determine initial access method, what an attacker read or modified, or what persistent backdoors were left. Audit logging is the difference between a targeted incident response and rebuilding the entire cluster from scratch.

AuditPolicy: tiered approach

Use for: Metadata level for most resources (low volume, high forensic value). Request level for RBAC changes and pod creates (captures intent). None for high-volume read paths (metric scrapers, controllers polling status).

Out-of-band log shipping

Use for: Ship audit logs to external SIEM or immutable log store immediately. Never rely on logs stored on the API server node -- attacker with node access can delete them. CloudWatch, Splunk, Elasticsearch with S3 backup.

High-value event alerting

Use for: Alert immediately (not batch) on: ClusterRoleBinding creates/updates, Secret reads in production namespaces, privileged pod creates, node shell commands. These are early indicators of attack progression.

The System View: Audit Log Flow

API Request: GET /api/v1/namespaces/prod/secrets/db-password
         |
         v  kube-apiserver audit filter
AuditPolicy rule: secrets -> level: Request
         |
         v  AuditEvent generated:
{
  "verb": "get",
  "user": {"username": "system:serviceaccount:prod:attacker-sa"},
  "objectRef": {"resource": "secrets", "name": "db-password", "namespace": "prod"},
  "sourceIPs": ["10.0.1.5"],
  "responseStatus": {"code": 200},
  "requestReceivedTimestamp": "2023-03-15T02:14:33Z",
  "stageTimestamp": "2023-03-15T02:14:33.001Z",
  "stage": "ResponseComplete"
}
         |
         v  Fluent Bit DaemonSet on API server node
              -> Elasticsearch (SIEM)
              -> S3 (immutable archive, 1-year retention)
         |
         v  Alert: Secret read in prod namespace -> PagerDuty

Every API request generates an audit event; ship immediately to out-of-band immutable store before attacker can delete on-node logs
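The alerting step at the end of that flow is just a predicate over audit event fields. A minimal sketch of one such rule -- page on any Secret read in the prod namespace -- using the field names from the audit event above (the rule itself is a hypothetical example, not a standard alert):

```python
import json

# Hypothetical alert rule: page on any Secret read in the "prod" namespace.
# Field names follow the audit.k8s.io/v1 event schema shown above.
def should_alert(event: dict) -> bool:
    ref = event.get("objectRef", {})
    return (
        event.get("verb") in {"get", "list"}
        and ref.get("resource") == "secrets"
        and ref.get("namespace") == "prod"
    )

raw = '''{"verb": "get",
  "user": {"username": "system:serviceaccount:prod:attacker-sa"},
  "objectRef": {"resource": "secrets", "name": "db-password", "namespace": "prod"},
  "responseStatus": {"code": 200}}'''
event = json.loads(raw)
print(should_alert(event))  # True: the event from the diagram triggers a page
```

In practice this predicate lives in the SIEM's rule engine (Elasticsearch watcher, Splunk alert, etc.), not in custom code, but the matching logic is the same.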

Audit Policy Design

Situation: Full RequestResponse on all resources
Before: Secret values appear in the response body of logs; encryption at rest is moot if logs contain plaintext; 10 TB/day of log volume
After: Metadata for most; Request for RBAC/pod creates; RequestResponse for nothing (Secret values never logged); a manageable 50 GB/day

Situation: Audit logs stored only on the API server node
Before: An attacker with node access deletes the audit logs; incident response is blind; cannot prove the scope of the breach
After: Fluent Bit ships logs within 30 seconds to an external SIEM; S3 Object Lock prevents deletion; the forensic trail is preserved even if the node is compromised
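The "S3 Object Lock prevents deletion" piece can be configured with the AWS CLI. An illustrative fragment (the bucket name is hypothetical, and Object Lock must be enabled when the bucket is created):

```shell
# Illustrative: default 1-year COMPLIANCE retention on the audit archive bucket,
# so even bucket admins cannot delete shipped logs before retention expires.
# Bucket name is a placeholder; Object Lock must be enabled at bucket creation.
aws s3api put-object-lock-configuration \
  --bucket audit-archive-example \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 365}}
  }'
```

COMPLIANCE mode (as opposed to GOVERNANCE) means no identity, including the root account, can shorten the retention period -- exactly the property you want for forensic evidence.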

How AuditPolicy Works

Designing a tiered audit policy

1. Identify high-value resources: secrets, clusterrolebindings, pods, nodes, serviceaccounts

2. Identify noisy low-value paths: GET/WATCH on configmaps/status/leases by controllers

3. Write rules: None for controller watch loops, Metadata for normal CRUD, Request for RBAC changes

4. Apply the policy via the kube-apiserver --audit-policy-file flag

5. Monitor log volume; adjust None rules for paths generating > 10% of total audit events

6. Set up alerts on Secret reads, ClusterRoleBinding creates, and privileged pod specs
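Step 5 -- finding which paths dominate audit volume -- is a grouping exercise over (user, resource, verb). A sketch under assumed inputs (the sample events and the 10% threshold mirror the step above; nothing here is a real tool):

```python
from collections import Counter

# Sketch of step 5: find which (user, resource, verb) paths exceed a share
# threshold of total audit volume, as candidates for a `level: None` rule.
def noisy_paths(events, threshold=0.10):
    counts = Counter(
        (e.get("user", {}).get("username"),
         e.get("objectRef", {}).get("resource"),
         e.get("verb"))
        for e in events
    )
    total = sum(counts.values())
    return {path: n for path, n in counts.items() if n / total > threshold}

# Made-up sample: a controller watch loop dwarfs a human user's secret reads.
events = (
    [{"user": {"username": "system:kube-controller-manager"},
      "objectRef": {"resource": "endpoints"}, "verb": "watch"}] * 80
    + [{"user": {"username": "alice"},
        "objectRef": {"resource": "secrets"}, "verb": "get"}] * 20
)
print(noisy_paths(events))
```

Both paths exceed 10% here, which is the point of reviewing before writing rules: the controller watch loop is a safe None candidate, while secret reads must never be silenced regardless of volume.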

audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log secret access at Request level: records who read a Secret
# without exposing its values in a response body
- level: Request
  resources:
  - group: ""
    resources: ["secrets"]

# Log RBAC changes at Request level
- level: Request
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["clusterrolebindings", "rolebindings", "clusterroles", "roles"]

# Skip noisy controller watch loops: kube-controller-manager polls
# constantly and would dominate audit volume
- level: None
  users: ["system:kube-controller-manager"]
  verbs: ["watch", "list"]
  resources:
  - group: ""
    resources: ["pods", "services", "endpoints"]

# Default: Metadata for everything else
- level: Metadata
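Audit policy rules are evaluated top to bottom and the first matching rule's level applies, which is why the None rule must sit above the Metadata catch-all. A simplified sketch of that matching (the real kube-apiserver matcher also considers API groups, namespaces, nonResourceURLs, and more):

```python
# Simplified first-match evaluation of audit policy rules, mirroring the
# policy above. Not the real apiserver implementation -- an illustration only.
RULES = [
    {"level": "Request", "resources": {"secrets"}},
    {"level": "Request", "resources": {"clusterrolebindings", "rolebindings",
                                       "clusterroles", "roles"}},
    {"level": "None", "users": {"system:kube-controller-manager"},
     "verbs": {"watch", "list"}, "resources": {"pods", "services", "endpoints"}},
    {"level": "Metadata"},  # catch-all default: no constraints, matches anything
]

def audit_level(user: str, verb: str, resource: str) -> str:
    for rule in RULES:
        if "users" in rule and user not in rule["users"]:
            continue
        if "verbs" in rule and verb not in rule["verbs"]:
            continue
        if "resources" in rule and resource not in rule["resources"]:
            continue
        return rule["level"]  # first match wins
    return "Metadata"

print(audit_level("alice", "get", "secrets"))                               # Request
print(audit_level("system:kube-controller-manager", "watch", "endpoints"))  # None
print(audit_level("bob", "get", "pods"))                                    # Metadata
```

Reordering matters: if the Metadata catch-all were listed first, every request would match it and the secrets and None rules would never fire.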

What Breaks in Production: Blast Radius

Audit logging failure modes

  • API server performance impact: RequestResponse level on all resources can slow API server response time by 15-20% and generate huge log volumes. Always profile log volume and API latency after changing the audit policy. Start Metadata-only, then add Request for specific resources.
  • Logs not shipped out-of-band: audit logs stored only on API server nodes mean node compromise equals log destruction. Ship within 30 seconds to an external SIEM, using a Fluent Bit DaemonSet on control plane nodes with direct SIEM integration.
  • Secret values in the audit response body: RequestResponse level for secrets logs the decrypted Secret values in the response body, defeating all encryption at rest. Never use RequestResponse for secrets; use Request level (logs who reads a Secret, not its value).
  • No alerting -- logs collected but not monitored: audit logs land in the SIEM but no alerts are configured. An attack happens; the logs record it faithfully; nobody sees it for 3 weeks (a real incident timeline). Configure real-time alerts on ClusterRoleBinding creates, Secret reads, and privileged pod creates.

RequestResponse level for secrets logs credential values in plaintext

Bug
# DANGEROUS: logs Secret VALUES in response body
rules:
- level: RequestResponse
  resources:
  - group: ""
    resources: ["secrets"]
# Every kubectl get secret now logs:
# responseObject.data.password = "aHVudGVyMg=="
# base64 decoded = "hunter2"
# All credential values in your SIEM/S3 logs
Fix
# Safe: logs who accessed secrets, not their values
rules:
- level: Request
  resources:
  - group: ""
    resources: ["secrets"]
# Logs: user, IP, verb, secret name, timestamp
# Does NOT log responseObject (no values exposed)

Request level logs the requestObject (what the user sent) but not responseObject (what the API server returned -- the secret values). Use Request level for secrets: forensically useful (who read it) without exposing credentials.
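If RequestResponse was ever misconfigured for secrets, the leaked values are already in the shipped logs. A hygiene check worth running is scanning for secrets events that carry a responseObject at all -- its presence means values were logged and the credentials need rotation. A sketch, assuming the common one-JSON-event-per-line log layout:

```python
import json

# Hygiene check (assumes one JSON audit event per line): flag any secrets
# event carrying a responseObject, i.e. logged at RequestResponse level.
def leaked_secret_events(lines):
    leaks = []
    for line in lines:
        event = json.loads(line)
        if (event.get("objectRef", {}).get("resource") == "secrets"
                and "responseObject" in event):
            leaks.append(event["objectRef"].get("name"))
    return leaks

# Made-up log lines: the first leaks values, the second is a safe Request-level event.
log_lines = [
    '{"objectRef": {"resource": "secrets", "name": "db-password"}, '
    '"responseObject": {"data": {"password": "aHVudGVyMg=="}}}',
    '{"objectRef": {"resource": "secrets", "name": "api-key"}, "verb": "get"}',
]
print(leaked_secret_events(log_lines))  # ['db-password'] -- rotate this credential
```

Every name this returns should be treated as a compromised credential, because the logs (and every system they were shipped to) now contain its plaintext value.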

Decision Guide: Audit Policy Levels

Is this a security-sensitive resource (secrets, RBAC, pod specs)?
  Yes: Request level -- logs who accessed it and what they sent; do not use RequestResponse for secrets (values get logged)
  No: Metadata level -- logs who accessed what, without request/response bodies; low volume, high forensic value

Is this a high-frequency controller watch loop?
  Yes: None level -- controllers watch constantly; including them floods the logs with no forensic value
  No: Metadata level at minimum for all other resources

Do you need real-time security alerts?
  Yes: set up SIEM alerts on ClusterRoleBinding creates, Secret reads in prod, and privileged pod creates; alert within 5 minutes
  No: at minimum, retain logs for 1 year for incident response and search on demand

Cost and Complexity: Audit Level Comparison

Level           | What is logged                      | Log volume | Forensic value                        | Use for
None            | Nothing                             | Zero       | None                                  | High-frequency controller watches
Metadata        | User, resource, verb, IP, timestamp | Low        | High                                  | Most resources (default)
Request         | Metadata + request body             | Medium     | Very high                             | RBAC changes, pod creates
RequestResponse | Metadata + request + response       | Very high  | Highest (+ security risk for secrets) | Non-sensitive debugging only
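To make the log volume column concrete, a back-of-envelope estimate under assumed numbers (the request rate and per-event sizes below are illustrative guesses, not measurements from this lesson):

```python
# Back-of-envelope audit volume estimate. All inputs are assumptions:
# a mid-size cluster's API request rate and rough per-event sizes per level.
requests_per_sec = 500
bytes_per_event = {"Metadata": 1_000, "Request": 5_000, "RequestResponse": 25_000}

for level, size in bytes_per_event.items():
    gb_per_day = requests_per_sec * size * 86_400 / 1e9
    print(f"{level}: ~{gb_per_day:,.0f} GB/day")
```

Under these assumptions, Metadata stays in the tens of GB/day while RequestResponse lands around a terabyte -- the same order-of-magnitude gap as the 50 GB/day vs 10 TB/day figures quoted earlier.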

Exam Answer vs. Production Reality


What audit logs capture

📖 What the exam expects

Every API request: verb (get/create/delete), resource (pods/secrets/configmaps), namespace, user/serviceaccount, source IP, request/response body (depending on audit level), timestamp.


How this might come up in interviews

Security and compliance questions about forensic capability and incident response readiness.

Common questions:

  • What does Kubernetes audit logging capture?
  • What are the four audit log levels?
  • How would you design an audit policy that is forensically useful without generating excessive noise?
  • Why must audit logs be shipped out-of-band from the cluster?

Strong answer: Mentions tiered AuditPolicy (different levels for different resources), out-of-band log shipping to immutable store, and alerting on high-value events (cluster-admin bindings, Secret reads).

Red flags: Disabling audit logging to save storage, or not knowing the audit policy levels.

Suggested next

Often learned after this topic.

Cluster Hardening & CIS Benchmark
