Runtime Security: Falco and Anomaly Detection

Runtime security detects attacks happening inside running containers -- after all other controls have been bypassed. Falco monitors kernel syscalls using eBPF to alert on container escapes, credential access, reverse shells, and privilege escalation in real time.

Relevant for: Senior, Staff
Why this matters at your level
Senior

Deploy Falco as a DaemonSet. Understand default rules (shell in container, sensitive file access, privilege escalation). Configure alert routing to SIEM/Slack. Tune rules to reduce false positives.

Staff

Design runtime security strategy: Falco for detection, automated pod quarantine via webhook on alert, integration with incident response playbooks, custom rules for your specific application threat model.

Container Breakout Detected -- Falco Alert -- Production -- 2022
T+0: Command injection exploit spawns /bin/sh in running api-service container
T+3s: Falco eBPF rule fires: "Terminal shell in container" alert sent to Slack
T+5m: Attacker downloads miner binary via curl
T+8m: Security team receives alert; investigates pod logs + Falco event stream
T+10m: Pod terminated; RCE patched; analysis shows attacker had 10 minutes of access

Falco detection latency for shell spawn: 3 seconds
Total attacker access before containment: 10 minutes
Application logs revealing the attack

The question this raises

If an attacker gets code execution inside your container, what visibility do you have -- and what is the detection latency before they can establish persistence?

Test your assumption first

Falco is deployed as a DaemonSet. A privileged container on node A runs nsenter to access the host network namespace. Will Falco on node A detect this?

What Runtime Security Solves

The Detection Gap After Deployment

Preventive controls such as RBAC, admission control, and image scanning operate before the container runs. Once a container is running, if an attacker exploits a vulnerability inside it, none of those deploy-time controls stops them. Runtime security (Falco) monitors what containers are doing at the syscall level, detecting attacks as they happen with latency measured in seconds.

Falco default rules

Use for: out-of-the-box detection of shells spawned in containers, writes to /etc or /usr, unexpected outbound connections, capability escalation, privileged container activity, and sensitive file access (/etc/shadow, Kubernetes service account tokens). Start with the defaults, then tune them down to reduce noise.

Custom Falco rules

Use for: Rules for your specific apps: this container should never open ports, this process should never read /var/run/secrets. Narrow rules = lower false positives. Requires profiling baseline behavior before writing negative rules.

Falco Sidekick

Use for: Falco alert router. Sends alerts to: Slack, PagerDuty, Elasticsearch, Prometheus, S3. Can trigger webhooks for automated response. Deploy alongside Falco to avoid routing logic in Falco config.
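
The routing role described for Falco Sidekick can be pictured as a priority floor per destination: each output receives only alerts at or above its floor. The Python sketch below illustrates that idea and nothing more. The `MINIMUM_PRIORITY` table and `destinations_for` function are hypothetical names, and the floors are assumed values, not a recommended configuration.

```python
# Falco priority levels, ordered from least to most severe.
PRIORITIES = ["DEBUG", "INFORMATIONAL", "NOTICE", "WARNING",
              "ERROR", "CRITICAL", "ALERT", "EMERGENCY"]

# Hypothetical routing table: each destination gets alerts at or above its floor.
MINIMUM_PRIORITY = {
    "elasticsearch": "DEBUG",    # keep everything for later investigation
    "slack": "WARNING",
    "pagerduty": "CRITICAL",
}


def destinations_for(priority: str):
    """Return the destinations whose floor is at or below the alert's priority."""
    level = PRIORITIES.index(priority.upper())
    return sorted(
        name for name, floor in MINIMUM_PRIORITY.items()
        if level >= PRIORITIES.index(floor)
    )


print(destinations_for("NOTICE"))
print(destinations_for("WARNING"))
print(destinations_for("CRITICAL"))
```

Keeping the full stream in a searchable store while paging only on the highest priorities is one way to keep forensic detail without paging on noise.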

The System View: Falco Monitoring Architecture

Node (Linux kernel)
  eBPF probe attached to syscall tracepoints
       |
       v  every syscall from every process on node
  Falco engine: rule evaluation
    Rule: "shell spawned in container"
      condition: spawned_process AND container AND proc.name in (shell_binaries)
       |
       v  MATCH: /bin/sh in api-service container
  Alert generated:
    {
      rule: "Terminal shell in container",
      output: "proc_name=sh container=api-service image=api:v3.2",
      priority: CRITICAL,
      time: 2022-03-15T02:14:33Z
    }
       |
       v  Falco Sidekick
    -> Slack: #security-alerts
    -> Elasticsearch: incident log
    -> Webhook: quarantine-controller (NetworkPolicy isolation)

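Coverage: the eBPF probe observes syscalls from every process on the node, but this is not a zero-false-negative guarantee. Events can be dropped under heavy load, and activity that matches no rule produces no alert.
Trade-off: roughly 3-5% CPU overhead on the node

eBPF attaches to kernel syscall tracepoints, so every process in every container on the node is monitored. Rule evaluation itself is fast; end-to-end alert delivery takes seconds, as in the timeline above.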
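
To make the pipeline above concrete, here is a minimal Python sketch of how a rule condition could be evaluated against a stream of events. It is not Falco's engine or API: the `SyscallEvent` type, the `SHELL_BINARIES` set, and the `shell_in_container` function are hypothetical names chosen for illustration, and real Falco rules are written in YAML and evaluated by Falco itself.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for Falco's shell_binaries list (illustrative subset).
SHELL_BINARIES = {"sh", "bash", "zsh", "dash", "ash"}


@dataclass(frozen=True)
class SyscallEvent:
    """Simplified event; real Falco events carry many more fields."""
    syscall: str                  # e.g. "execve", "connect"
    proc_name: str                # name of the process making the call
    container_id: Optional[str]   # None when the process runs on the host


def shell_in_container(event: SyscallEvent) -> bool:
    """Mirror of: spawned_process and container and proc.name in (shell_binaries)."""
    spawned_process = event.syscall == "execve"
    in_container = event.container_id is not None
    return spawned_process and in_container and event.proc_name in SHELL_BINARIES


def evaluate(events):
    """Return one alert dict per event that matches the rule."""
    return [
        {"rule": "Terminal shell in container",
         "proc_name": e.proc_name,
         "container": e.container_id}
        for e in events
        if shell_in_container(e)
    ]


events = [
    SyscallEvent("execve", "node", "api-service"),   # normal app process
    SyscallEvent("execve", "sh", "api-service"),     # shell inside a container
    SyscallEvent("execve", "bash", None),            # shell on the host
    SyscallEvent("connect", "sh", "api-service"),    # not a process spawn
]
alerts = evaluate(events)
print(alerts)
```

Only the second event satisfies all three parts of the condition, which is why scoping terms such as `container` matter for keeping host activity out of container rules.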

Runtime Security Response Patterns

Situation: Falco detects reverse shell; automated pod kill
Before: False positive on legitimate debug session kills production pod; availability incident created by security tooling
After: Falco alert -> NetworkPolicy quarantine (isolate pod, preserve forensics) -> human review -> terminate if confirmed; 5-minute SLA for review

Situation: Falco default rules generate 200 alerts/day
Before: Alert fatigue; team ignores all Falco alerts; actual attack buried in noise
After: Tune rules: add container image allowlist, suppress known-good process spawns; target < 5 alerts/day; every alert is actionable
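
One way to approach the tuning described for the 200 alerts/day situation is to rank rules by alert volume and tune the noisiest first. The sketch below is an assumption about how that triage might look, not a Falco feature. The `noisiest_rules` helper and the sample counts are hypothetical, and the rule names serve only as labels.

```python
from collections import Counter


def noisiest_rules(alerts, top_n=3):
    """Count alerts per rule name and return the top_n noisiest rules."""
    counts = Counter(alert["rule"] for alert in alerts)
    return counts.most_common(top_n)


# Hypothetical one-day sample: most of the volume comes from a single rule.
sample_alerts = (
    [{"rule": "Write below etc"}] * 150
    + [{"rule": "Terminal shell in container"}] * 45
    + [{"rule": "Read sensitive file untrusted"}] * 5
)

ranking = noisiest_rules(sample_alerts)
for rule, count in ranking:
    print(f"{count:4d}  {rule}")
```

In this made-up sample a single rule accounts for three quarters of the volume, so an image allowlist or exception for that one rule would do more than touching the other two.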

How Falco Rules Work

Custom Falco rule for API service monitoring

1. Profile baseline behavior: run Falco in observe-only mode (alerts logged, nothing paged or automated) for 1 week and collect the events each container generates
2. Identify normal: api-service always reads /etc/ssl, connects to db.internal:5432, spawns no subprocesses
3. Write negative rule: alert if api-service spawns any process (it never should)
4. Write negative rule: alert if api-service opens outbound connections to non-approved IPs
5. Deploy the rules in a warn-only stage for 1 week (alerts go to a low-urgency channel); tune out false positives (scheduled health checks, etc.)
6. Promote to full alerting; route CRITICAL to PagerDuty, WARNING to Slack
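
Steps 1 to 3 amount to building an allowlist from observed behavior and flagging anything outside it. A minimal Python sketch of that idea follows. The event tuples, `build_baseline`, and `deviations` are hypothetical illustrations; in practice the baseline would come from Falco's event output rather than hand-written data.

```python
from collections import defaultdict


def build_baseline(observed):
    """Map each container image to the set of process names seen during profiling."""
    baseline = defaultdict(set)
    for image, proc_name in observed:
        baseline[image].add(proc_name)
    return dict(baseline)


def deviations(baseline, live_events):
    """Return live (image, proc_name) pairs that were never seen in the baseline."""
    return [
        (image, proc)
        for image, proc in live_events
        if proc not in baseline.get(image, set())
    ]


# Hypothetical profiling data: the API image only ever runs its own binary.
profiling = [("registry.company.com/api", "api-server")] * 3
baseline = build_baseline(profiling)

live = [
    ("registry.company.com/api", "api-server"),  # expected
    ("registry.company.com/api", "sh"),          # never seen during profiling
    ("registry.company.com/api", "curl"),        # never seen during profiling
]
unexpected = deviations(baseline, live)
print(unexpected)
```

A short profiling window will miss rare but legitimate behavior such as scheduled jobs, which is why the steps above include a warn-only week before promotion.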

falco-custom-rules.yaml
- rule: Shell Spawned in API Service
  desc: Detect any shell spawned in the api service container
  # Scoping to a specific image prevents noise from other containers on the node
  condition: >
    spawned_process
    and container
    and container.image.repository = "registry.company.com/api"
    and proc.name in (shell_binaries)
  output: >
    Shell spawned in api container
    (user=%user.name proc=%proc.name container=%container.id)
  priority: CRITICAL
  tags: [container, shell, attack]

# allowed_outbound_ips must be defined as a Falco list containing the DB, cache, and API gateway IPs
- rule: Unexpected Outbound Connection from API Service
  desc: api service should only connect to approved backends
  condition: >
    outbound
    and container.image.repository = "registry.company.com/api"
    and not (fd.sip in (allowed_outbound_ips))
  output: >
    Unexpected outbound connection (dest=%fd.rip port=%fd.rport)
  priority: WARNING
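
The second rule depends on an `allowed_outbound_ips` list that the rules file must define. To show the matching logic in isolation, this Python sketch checks destinations against an allowlist using the standard `ipaddress` module. The addresses are documentation-range placeholders (RFC 5737), not values from any real environment, and `is_unexpected` is a hypothetical helper.

```python
import ipaddress

# Placeholder allowlist using RFC 5737 documentation addresses (assumed values).
ALLOWED_OUTBOUND = [
    ipaddress.ip_network("192.0.2.10/32"),    # stand-in for the database
    ipaddress.ip_network("192.0.2.20/32"),    # stand-in for the cache
    ipaddress.ip_network("198.51.100.0/28"),  # stand-in for an API gateway range
]


def is_unexpected(dest_ip: str) -> bool:
    """True when dest_ip falls outside every allowed network."""
    addr = ipaddress.ip_address(dest_ip)
    return not any(addr in net for net in ALLOWED_OUTBOUND)


print(is_unexpected("192.0.2.10"))     # allowed database address
print(is_unexpected("198.51.100.5"))   # inside the gateway range
print(is_unexpected("203.0.113.99"))   # not on the allowlist
```

IP allowlists go stale when backends are rescheduled or scaled, so a list like this needs the same maintenance as the rule that consumes it.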

What Breaks in Production: Blast Radius

Runtime security failure modes

  • Alert fatigue from untuned rules: Default Falco rules fire on many legitimate activities in busy clusters. Without tuning, alerts flood Slack, the team mutes notifications, and an actual attack goes undetected. Spend 2 weeks tuning before treating Falco as a security control.
  • eBPF probe kernel compatibility: Falco's eBPF probe requires kernel >= 4.14. Some distributions ship patched kernels that break eBPF. Test Falco on your exact kernel version before deploying to production, and keep the kernel module driver as a fallback if eBPF fails.
  • CPU overhead at high syscall rate: Falco eBPF adds roughly 3-5% CPU overhead per node. For IO-intensive workloads (databases, storage controllers) making millions of syscalls per second, this can be significant. Profile before deploying to storage nodes.
  • Falco DaemonSet not privileged enough: Falco needs access to /proc, /sys, and the eBPF subsystem. Deploying it in a namespace that enforces the restricted Pod Security Standard (PSS) breaks Falco. Run it in a namespace that permits privileged pods, with an appropriate securityContext.
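
A pre-deployment check for the kernel requirement could look like the following Python sketch. The 4.14 floor comes from the text above. The `parse_kernel_version` and `supports_ebpf_probe` helpers are hypothetical, and a passing version check does not prove the probe will load on a vendor-patched kernel.

```python
import platform
import re

MIN_EBPF_KERNEL = (4, 14)  # minimum stated above for the eBPF probe


def parse_kernel_version(release: str):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def supports_ebpf_probe(release: str) -> bool:
    """Compare the parsed version against the minimum as a tuple."""
    return parse_kernel_version(release) >= MIN_EBPF_KERNEL


if __name__ == "__main__":
    current = platform.release()
    try:
        print(current, supports_ebpf_probe(current))
    except ValueError as exc:
        print(exc)
```

Comparing versions as integer tuples avoids the string-comparison trap where "4.9" sorts after "4.14".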

Automated pod kill on all Falco alerts creates availability risk

Bug
# Falco Sidekick webhook config
outputs:
  webhook:
    address: http://pod-killer-service
    # Triggers kubectl delete pod on ANY Falco CRITICAL alert
    # Problem: false positive on legitimate admin debug session
    # -> kills production pod
    # -> creates availability incident
    # -> security tooling becomes the threat
Fix
# Better: quarantine + alert + human review
outputs:
  slack:
    webhookurl: https://hooks.slack.com/...
    channel: "#security-critical"
  webhook:
    address: http://quarantine-controller
    # Quarantine: add NetworkPolicy isolating pod
    # Preserves pod for forensics
    # Human reviews within 5-min SLA
    # Terminates ONLY after human confirmation

Automated termination on all Falco alerts creates availability risk from false positives. Quarantine (network isolation) preserves forensic evidence while blocking lateral movement. Human-in-the-loop review ensures only confirmed attacks result in pod termination.
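
The quarantine step can be pictured as a controller that labels the suspect pod and applies a deny-all NetworkPolicy selecting that label. The Python sketch below only builds the manifest as a dictionary and does not call the Kubernetes API. The `quarantine` label key, the example pod name, and the `build_quarantine_policy` function are hypothetical, while the manifest fields follow the `networking.k8s.io/v1` NetworkPolicy schema.

```python
import json


def build_quarantine_policy(namespace: str, pod_name: str) -> dict:
    """Build a deny-all NetworkPolicy for pods carrying the quarantine label."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"quarantine-{pod_name}",
            "namespace": namespace,
        },
        "spec": {
            # Hypothetical label the controller would add to the suspect pod.
            "podSelector": {"matchLabels": {"quarantine": pod_name}},
            # Listing both types with no ingress/egress rules denies all traffic.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


policy = build_quarantine_policy("production", "api-service-7d9f")
print(json.dumps(policy, indent=2))
```

A policy like this takes effect only if the cluster's network plugin enforces NetworkPolicy, so the quarantine path is worth testing before an incident relies on it.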

Decision Guide: Runtime Security Deployment

Does your cluster handle sensitive data or have compliance requirements?
Yes: Deploy Falco as DaemonSet; tune rules for 2 weeks before enabling alerts; integrate with SIEM
No: Falco is still valuable for detection; deploy in audit-only mode initially
Do Falco alerts need automated response?
Yes: Implement quarantine (NetworkPolicy isolation) not pod kill; require human confirmation before termination
No: Alerting to Slack/PagerDuty with human response is sufficient for most environments
Is the kernel version compatible with eBPF?
Yes: Use Falco eBPF driver (lower overhead, no kernel module needed)
No: Use Falco kernel module driver; test on staging first; or use Tetragon (Cilium) as alternative

Cost and Complexity: Runtime Security Tools

Tool | Detection method | Overhead | False positive rate | When to use
Falco (eBPF) | Syscall tracing | 3-5% CPU | Medium (needs tuning) | Primary runtime detection for most clusters
Falco (kernel module) | Syscall tracing | 5-8% CPU | Medium | Kernel too old for eBPF
Tetragon (Cilium) | eBPF + K8s-aware | 2-4% CPU | Low (K8s context) | Cilium clusters wanting tighter integration
AppArmor / Seccomp | Syscall blocking | Minimal | Zero (blocks vs alerts) | Defense in depth alongside Falco

Exam Answer vs. Production Reality

How Falco detects attacks

📖 What the exam expects

Falco uses eBPF (or kernel module) to intercept syscalls for every process in every container on the node. Rules define suspicious syscall patterns. Alerts fire when patterns match.

How this might come up in interviews

Security design questions about defense in depth for container workloads and incident response questions about detecting attacks in production.

Common questions:

  • How does Falco detect attacks inside containers?
  • What is the difference between runtime security and container scanning?
  • How would you reduce false positives from Falco in production?
  • What should happen automatically when Falco fires a critical alert?

Strong answer: Mentions eBPF vs kernel module tradeoffs, Falco Sidekick for alert routing, quarantine-then-review response pattern, and custom rules for specific application behaviors.

Red flags: Thinking Falco replaces image scanning, or not knowing that automated pod termination on all alerts creates availability risk.
