Runtime Security: Falco and Anomaly Detection
Runtime security detects attacks happening inside running containers -- after all other controls have been bypassed. Falco monitors kernel syscalls using eBPF to alert on container escapes, credential access, reverse shells, and privilege escalation in real time.
Why this matters at your level
Deploy Falco as a DaemonSet. Understand default rules (shell in container, sensitive file access, privilege escalation). Configure alert routing to SIEM/Slack. Tune rules to reduce false positives.
Design runtime security strategy: Falco for detection, automated pod quarantine via webhook on alert, integration with incident response playbooks, custom rules for your specific application threat model.
1. Command injection exploit spawns /bin/sh in the running api-service container.
2. Falco eBPF rule fires: "Terminal shell in container" alert sent to Slack.
3. Attacker downloads a miner binary via curl.
4. Security team receives the alert; investigates pod logs and the Falco event stream.
5. Pod terminated; RCE patched; analysis shows the attacker had 10 minutes of access.
The question this raises
If an attacker gets code execution inside your container, what visibility do you have -- and what is the detection latency before they can establish persistence?
Falco is deployed as a DaemonSet. A privileged container on node A runs nsenter to access the host network namespace. Will Falco on node A detect this?
Lesson outline
What Runtime Security Solves
The Detection Gap After Deployment
All preventive controls (RBAC, admission, image scanning) operate before the container runs. Once a container is running, if an attacker exploits a vulnerability inside it, no preventive control stops them. Runtime security (Falco) monitors what containers are doing at the syscall level -- detecting attacks as they happen with second-level latency.
Falco default rules
Use for: out-of-the-box detection of a shell spawned in a container, writes to /etc or /usr, unexpected outbound connections, capability escalation, privileged container activity, and sensitive file access (/etc/shadow, Kubernetes service account tokens). Start with the defaults; tune down from there.
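As a sketch of that tuning, Falco lets you append an exception to a default rule instead of rewriting it. The override below is illustrative: the file name, exception name, and debug-toolbox image are placeholders, and the exception syntax assumes a Falco version with rule-exception support (0.28+); check which exceptions your version's default rules already define before adding new ones.

```yaml
# Hypothetical override file (e.g. falco_rules.local.yaml):
# append an exception to the default shell-detection rule so a
# known-good debug image stops generating alerts.
- rule: Terminal shell in container
  append: true
  exceptions:
    - name: approved_debug_image        # placeholder exception name
      fields: [container.image.repository]
      comps: ["="]
      values:
        - ["registry.company.com/debug-toolbox"]   # placeholder image
```

This keeps the default rule intact and upgradeable; your suppressions live in a separate file that survives Falco rule updates.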
Custom Falco rules
Use for: rules specific to your applications: this container should never open ports; this process should never read /var/run/secrets. Narrow rules mean fewer false positives, but they require profiling baseline behavior before writing negative rules.
Falco Sidekick
Use for: routing Falco alerts. Sends alerts to Slack, PagerDuty, Elasticsearch, Prometheus, S3, and others, and can trigger webhooks for automated response. Deploy it alongside Falco to keep routing logic out of the Falco config itself.
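One way to wire this up is through the falcosecurity/falco Helm chart, which can deploy Sidekick for you. The sketch below assumes a recent chart version; the Slack URL, channel, and quarantine-controller address are placeholders.

```yaml
# Sketch: Helm values enabling Falco Sidekick as the alert router.
# All URLs and service names below are placeholders.
falcosidekick:
  enabled: true
  config:
    slack:
      webhookurl: "https://hooks.slack.com/services/..."  # placeholder
      channel: "#security-alerts"
      minimumpriority: warning    # drop INFO-level noise from Slack
    webhook:
      address: "http://quarantine-controller.security.svc:8080"
```

Keeping routing in Sidekick means you can add or change destinations (PagerDuty, Elasticsearch, S3) without touching Falco's rule configuration or restarting the DaemonSet pods that do detection.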
The System View: Falco Monitoring Architecture

```
Node (Linux kernel)
  eBPF probe attached to syscall tracepoints
        |
        v   every syscall from every process on the node
  Falco engine: rule evaluation
      Rule: "shell spawned in container"
      condition: spawned_process AND container AND shell_binaries
        |
        v   MATCH: /bin/sh in api-service container
  Alert generated:
      {
        "rule": "Terminal shell in container",
        "output": "proc_name=sh container=api-service image=api:v3.2",
        "priority": "CRITICAL",
        "time": "2022-03-15T02:14:33Z"
      }
        |
        v
  Falco Sidekick
      -> Slack: #security-alerts
      -> Elasticsearch: incident log
      -> Webhook: quarantine-controller (NetworkPolicy isolation)
```

The eBPF probe attaches to kernel syscall tracepoints, so every process in every container on the node is monitored and rules evaluate in microseconds. One caveat: "sees all syscalls" holds only while the event buffer keeps up -- under extreme syscall load Falco can drop events (and reports the drops), so this is not a zero-false-negative guarantee. Trade-off: roughly 3-5% CPU overhead per node.
Runtime Security Response Patterns

| Scenario | Anti-pattern | Better pattern |
|---|---|---|
| Falco detects reverse shell | Automated pod kill: a false positive on a legitimate debug session kills a production pod; the availability incident is created by the security tooling itself | Falco alert -> NetworkPolicy quarantine (isolate pod, preserve forensics) -> human review -> terminate if confirmed; 5-minute SLA for review |
| Falco default rules generate 200 alerts/day | Alert fatigue: the team ignores all Falco alerts; an actual attack is buried in the noise | Tune rules: add a container image allowlist, suppress known-good process spawns; target < 5 alerts/day so every alert is actionable |
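The quarantine step above can be as simple as a deny-all NetworkPolicy that matches a label the responder (or the webhook controller) adds to the suspect pod. The manifest below is illustrative: the namespace and label key are assumptions, and enforcement requires a CNI plugin that implements NetworkPolicy.

```yaml
# Illustrative quarantine policy: selects pods the quarantine
# controller has labeled, and denies all traffic while the pod
# keeps running for forensic inspection.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine-suspect-pods
  namespace: production                 # illustrative namespace
spec:
  podSelector:
    matchLabels:
      quarantine: "true"                # illustrative label added on alert
  policyTypes: [Ingress, Egress]
  # No ingress/egress rules listed: all traffic to and from
  # matching pods is denied.
```

Because the policy is pre-created, isolating a pod is a single label patch, which is fast, reversible after a false positive, and leaves process state and filesystem intact for investigation.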
How Falco Rules Work
Custom Falco rule for API service monitoring
1. Profile baseline behavior: run Falco in dry-run mode for 1 week; collect all syscall patterns
2. Identify normal: api-service always reads /etc/ssl, connects to db.internal:5432, spawns no subprocesses
3. Write negative rule: alert if api-service spawns any process (it never should)
4. Write negative rule: alert if api-service opens outbound connections to non-approved IPs
5. Deploy rules in warn mode for 1 week; tune false positives (scheduled health checks, etc.)
6. Promote to alert mode; route CRITICAL to PagerDuty, WARNING to Slack
```yaml
- rule: Shell Spawned in API Service
  desc: Detect any shell spawned in the api service container
  # Scope rule to specific image -- prevents noise from other
  # containers on the node
  condition: >
    spawned_process
    and container
    and container.image.repository = "registry.company.com/api"
    and proc.name in (shell_binaries)
  output: >
    Shell spawned in api container
    (user=%user.name proc=%proc.name container=%container.id)
  priority: CRITICAL
  tags: [container, shell, attack]

# allowed_outbound_ips: define a list macro with DB, cache, and
# API gateway IPs before this rule loads
- rule: Unexpected Outbound Connection from API Service
  desc: api service should only connect to approved backends
  condition: >
    outbound
    and container.image.repository = "registry.company.com/api"
    and not (fd.sip in (allowed_outbound_ips))
  output: >
    Unexpected outbound connection (dest=%fd.rip port=%fd.rport)
  priority: WARNING
```
What Breaks in Production: Blast Radius
Runtime security failure modes
- Alert fatigue from untuned rules — Default Falco rules fire on many legitimate activities in busy clusters. Without tuning, alerts flood Slack; team disables notifications; actual attack goes undetected. Spend 2 weeks tuning before treating Falco as a security control.
- eBPF probe kernel compatibility — Falco eBPF requires kernel >= 4.14. Some distributions use patched kernels that break eBPF. Test Falco on your exact kernel version before deploying to production. Have fallback to kernel module mode if eBPF fails.
- CPU overhead at high syscall rate — Falco eBPF adds ~3-5% CPU overhead per node. For IO-intensive workloads (databases, storage controllers) making millions of syscalls/second, this can be significant. Profile before deploying to storage nodes.
- Falco DaemonSet not privileged enough — Falco needs access to /proc, /sys, and the eBPF subsystem. Deploying in a namespace with restricted PSS breaks Falco. Falco must run in a privileged namespace with appropriate securityContext.
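The eBPF-compatibility fallback described above can be expressed directly in the Falco Helm chart's values. The sketch below assumes a recent falcosecurity/falco chart where `driver.kind` selects the probe; verify the accepted values against your chart version.

```yaml
# Sketch: choose the Falco driver explicitly so you can fall back
# to the kernel module on nodes whose kernels break eBPF.
driver:
  kind: modern_ebpf     # alternatives: ebpf, kmod (kernel module fallback)
falco:
  json_output: true     # structured alerts for SIEM ingestion
```

Testing this on a staging node with your exact kernel build, before rolling the DaemonSet out fleet-wide, catches the patched-kernel incompatibilities mentioned above.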
Automated pod kill on all Falco alerts creates availability risk
```yaml
# Falco Sidekick webhook config
outputs:
  webhook:
    address: http://pod-killer-service
    # Triggers kubectl delete pod on ANY Falco CRITICAL alert
    # Problem: false positive on a legitimate admin debug session
    #   -> kills production pod
    #   -> creates availability incident
    #   -> security tooling becomes the threat
```

```yaml
# Better: quarantine + alert + human review
outputs:
  slack:
    webhookurl: https://hooks.slack.com/...
    channel: "#security-critical"
  webhook:
    address: http://quarantine-controller
    # Quarantine: add a NetworkPolicy isolating the pod
    # Preserves the pod for forensics
    # Human reviews within a 5-minute SLA
    # Terminates ONLY after human confirmation
```

Automated termination on all Falco alerts creates availability risk from false positives. Quarantine (network isolation) preserves forensic evidence while blocking lateral movement. Human-in-the-loop review ensures only confirmed attacks result in pod termination.
Decision Guide: Runtime Security Deployment
Cost and Complexity: Runtime Security Tools
| Tool | Detection method | Overhead | False positive rate | When to use |
|---|---|---|---|---|
| Falco (eBPF) | Syscall tracing | 3-5% CPU | Medium (needs tuning) | Primary runtime detection for most clusters |
| Falco (kernel module) | Syscall tracing | 5-8% CPU | Medium | Kernel too old for eBPF |
| Tetragon (Cilium) | eBPF + K8s-aware | 2-4% CPU | Low (K8s context) | Cilium clusters wanting tighter integration |
| AppArmor / Seccomp | Syscall blocking | Minimal | Zero (blocks vs alerts) | Defense in depth alongside Falco |
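As a sketch of the last row -- blocking controls layered under detection -- a pod can opt into the runtime's default seccomp profile and drop all capabilities, so many dangerous syscalls are denied outright and Falco only has to alert on what seccomp still permits. The pod name and image below are illustrative.

```yaml
# Illustrative pod spec: seccomp + dropped capabilities block
# dangerous syscalls; Falco alerts on what remains allowed.
apiVersion: v1
kind: Pod
metadata:
  name: api-service
spec:
  securityContext:
    seccompProfile:
      type: RuntimeDefault             # runtime's default seccomp filter
  containers:
    - name: api
      image: registry.company.com/api:v3.2   # illustrative image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```

The two layers fail differently: seccomp blocks silently with zero false positives, while Falco observes and explains, which is why the table recommends running them together rather than choosing one.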
Exam Answer vs. Production Reality
How Falco detects attacks
📖 What the exam expects
Falco uses eBPF (or kernel module) to intercept syscalls for every process in every container on the node. Rules define suspicious syscall patterns. Alerts fire when patterns match.
How this might come up in interviews
Security design questions about defense in depth for container workloads and incident response questions about detecting attacks in production.
Common questions:
- How does Falco detect attacks inside containers?
- What is the difference between runtime security and container scanning?
- How would you reduce false positives from Falco in production?
- What should happen automatically when Falco fires a critical alert?
Strong answer: Mentions eBPF vs kernel module tradeoffs, Falco Sidekick for alert routing, quarantine-then-review response pattern, and custom rules for specific application behaviors.
Red flags: Thinking Falco replaces image scanning, or not knowing that automated pod termination on all alerts creates availability risk.