The Simplified Tech
© 2026 TheSimplifiedTech. All rights reserved.

Interactive Explainer

Container Networking Fundamentals: How Packets Move Between Pods

Container networking is built on virtual ethernet pairs, network namespaces, bridges, and iptables/eBPF rules. Every pod-to-pod communication, every Service ClusterIP, and every CNI plugin decision builds on these primitives. Understanding how packets actually move is the difference between guessing at network problems and knowing exactly which hop to inspect.

Relevant for: Mid-level, Senior, Staff
Why this matters at your level
Junior

Know that every pod gets a unique IP. Pods on the same node communicate via a virtual bridge. Pods on different nodes use CNI overlay (VXLAN or BGP routing). Kubernetes requires pod-to-pod communication without NAT.

Mid-level

Trace the packet path from pod A to pod B using ip route, ip addr, and arp on a node. Understand veth pairs and how they connect pod network namespaces to the node bridge. Debug cross-node pod communication failures by checking each hop.

Senior

Evaluate CNI plugins by networking model and security properties. Design NetworkPolicy for zero-trust pod networking. Debug MTU mismatches. Understand eBPF-based networking vs iptables-based networking and their scale implications.

Staff

Own the CNI selection decision. Evaluate security properties: ARP isolation, encryption, multi-tenant isolation. Design network architecture for PCI or HIPAA workloads. Evaluate Cilium vs Calico vs VPC-native networking against specific compliance requirements.

~6 min read
LIVE: Data Plane Failure — Flannel ARP Poisoning — Multi-Tenant Cluster
Breaking News
T+0

Attacker deploys malicious pod to a node shared with Tenant B pods

T+10m

Attacker pod sends gratuitous ARP replies claiming Tenant B pod IP, poisoning node ARP cache

T+12m

Traffic from other pods destined for Tenant B's IP redirected to attacker pod via poisoned ARP

T+15m

Attacker reads plaintext HTTP API calls including authentication tokens

T+3h

Anomalous traffic pattern detected -- attacker pod receiving traffic for IPs it did not own

  • Undetected traffic interception
  • Service traffic exposed (no mTLS)
  • Attack vector: ARP (below NetworkPolicy)
  • Triggered during the incident

The question this raises

If every pod has its own network namespace, how can one pod intercept traffic intended for another -- and what layer of the network stack is missing protection?
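On a live node you can check for this condition directly. A minimal sketch, assuming a Flannel-style cni0 bridge and a hypothetical victim pod IP of 10.244.1.7 (the pod name and addresses are illustrative):

# List the bridge's neighbor (ARP) cache -- each pod IP should map to one MAC:
$ ip neigh show dev cni0
10.244.1.7 lladdr 52:c5:3a:b3:e4:f1 REACHABLE

# Compare against the MAC actually configured inside the victim pod:
$ kubectl exec tenant-b-pod -- cat /sys/class/net/eth0/address

# A mismatch, or two pod IPs resolving to the same MAC, suggests a poisoned entry.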

Test your assumption first

A pod on Node A cannot reach a pod on Node B, but pods on the SAME node communicate fine. Both pods show Running with IPs assigned. What is the most likely cause?

Lesson outline

What Problem Container Networking Solves

The Kubernetes networking model: flat IPs, no NAT

Kubernetes requires that every pod can communicate with every other pod directly using its IP, without NAT. This is fundamentally different from Docker's default NAT model where containers use private IPs and the host rewrites addresses. The flat pod network makes routing simpler but requires the CNI plugin to create and maintain routes across all nodes so that 10.244.1.7 is reachable from any pod in the cluster.
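One way to see the no-NAT property in practice: when one pod calls another, the server sees the client's real pod IP as the source address, not a node IP. A hedged sketch using throwaway pods (names and IPs are illustrative):

# Start a server and a client pod, then call the server by its pod IP
$ kubectl run server --image=nginx
$ kubectl run client --image=busybox -- sleep 3600
$ kubectl get pod server -o jsonpath='{.status.podIP}'
10.244.1.7
$ kubectl exec client -- wget -qO- 10.244.1.7 > /dev/null

# The nginx access log records the client's pod IP (e.g. 10.244.0.5),
# not a node IP -- no address rewriting happened anywhere on the path:
$ kubectl logs server | tail -1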

veth pair

A virtual ethernet cable with two ends. One end is eth0 inside the pod network namespace; the other end sits on the host and connects to the bridge. Packets entering one end exit the other -- exactly like a physical network cable.

Linux bridge (cni0)

A virtual Layer 2 switch on each node. All pods on the node connect to this bridge, which forwards packets using MAC addresses and ARP -- exactly like a physical Ethernet switch in a data center.
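These primitives are ordinary Linux features, and the same-node path can be rebuilt by hand to see exactly what a CNI plugin does. A sketch for a scratch Linux machine (root required; the names podns, br0, and the 10.244.0.0/24 addresses are illustrative):

$ sudo ip netns add podns                                   # the "pod's" network namespace
$ sudo ip link add veth-host type veth peer name veth-pod   # the virtual cable
$ sudo ip link set veth-pod netns podns                     # one end becomes the pod's NIC
$ sudo ip link add br0 type bridge                          # the node's virtual L2 switch
$ sudo ip link set veth-host master br0 && sudo ip link set veth-host up
$ sudo ip addr add 10.244.0.1/24 dev br0 && sudo ip link set br0 up
$ sudo ip netns exec podns ip addr add 10.244.0.5/24 dev veth-pod
$ sudo ip netns exec podns ip link set veth-pod up
$ sudo ip netns exec podns ip route add default via 10.244.0.1
$ sudo ip netns exec podns ping -c1 10.244.0.1              # the "pod" reaches its gateway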

VXLAN tunnel

Encapsulates pod-to-pod traffic in UDP packets for cross-node communication, creating a virtual Layer 2 network over the physical Layer 3 network. The flannel.1 or calico-vxlan interfaces handle encapsulation and decapsulation transparently.
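The tunnel parameters are visible with iproute2. A sketch, assuming a Flannel node (interface names and ports vary by CNI):

# Inspect the VXLAN interface the CNI created -- note the VNI and UDP port:
$ ip -d link show flannel.1

# The same kind of interface can be created by hand:
$ sudo ip link add vxlan0 type vxlan id 42 dstport 4789 local 192.168.1.10 dev eth0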

CNI plugin binary

Executed by the kubelet (via the container runtime) to set up networking for each new pod. It creates the veth pair, assigns an IP from the pod CIDR, adds routes to the bridge, and configures the overlay tunnel entries. It runs as a privileged binary on each node, not as a daemon.
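Because a CNI plugin is just an executable driven by environment variables and a JSON network config on stdin (per the CNI spec), it can be invoked by hand. A hedged sketch of roughly what the container runtime runs; the netns path and config filename are illustrative:

$ CNI_COMMAND=ADD \
  CNI_CONTAINERID=abc123 \
  CNI_NETNS=/var/run/netns/podns \
  CNI_IFNAME=eth0 \
  CNI_PATH=/opt/cni/bin \
  sudo -E /opt/cni/bin/bridge < /etc/cni/net.d/10-bridge.conf
# On success the plugin prints a JSON result containing the assigned IP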

The System View: Cross-Node Packet Journey

NODE 1 (physical IP: 192.168.1.10)      NODE 2 (physical IP: 192.168.1.11)

Pod A (10.244.0.5)                      Pod B (10.244.1.7)
+----------------------------+          +----------------------------+
| eth0: 10.244.0.5           |          | eth0: 10.244.1.7           |
| route: default via         |          | route: default via         |
|   10.244.0.1               |          |   10.244.1.1               |
+------------|---------------+          +------------|---------------+
             | veth pair                             | veth pair
             v                                       ^
+----------------------------+          +----------------------------+
| cni0 bridge: 10.244.0.1    |          | cni0 bridge: 10.244.1.1    |
| route: 10.244.1.0/24       |          |                            |
|   -> via flannel.1         |          |                            |
+------------|---------------+          +------------|---------------+
             | VXLAN encapsulate                     | VXLAN decapsulate
             v                                       ^
+----------------------------+          +----------------------------+
| flannel.1 (VXLAN iface)    |  UDP --> | flannel.1 (VXLAN iface)    |
| outer: 192.168.1.10 ->     |          | outer: 192.168.1.10 ->     |
|   192.168.1.11:8472        |          |   192.168.1.11:8472        |
| inner: 10.244.0.5 ->       |          | inner: 10.244.0.5 ->       |
|   10.244.1.7               |          |   10.244.1.7               |
+------------|---------------+          +------------|---------------+
             |                                       ^
             v                                       |
      [Physical Network: 192.168.1.0/24] ------------+

NOTE: Cloud security groups MUST allow UDP 8472 (Flannel) or 4789 (Calico/Cilium)
      between all node IPs -- this is the most common cross-node networking failure.

The VXLAN tunnel wraps pod IP packets (inner) inside UDP packets (outer) using node IPs for transport. The inner packet preserves the original pod IPs throughout -- no NAT at any hop.
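You can watch both layers with tcpdump. On the physical interface you see node IPs on UDP 8472; on the VXLAN interface you see only pod IPs. (Recent tcpdump versions decode the inner packet; if it is not auto-detected on port 8472, add -T vxlan.)

# Outer view: node-to-node UDP carrying the encapsulated pod packet
$ sudo tcpdump -ni eth0 udp port 8472

# Inner view: the same traffic after decapsulation, pod IP to pod IP
$ sudo tcpdump -ni flannel.1 icmp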

Container networking misconceptions

Situation: Pod A sends a packet to Pod B on a different node

Misconception: “Pods use NAT -- the packet goes out through the node IP, gets rewritten at the node, and the destination node rewrites it back. Like how home network devices share one public IP.”

Reality: “Pods communicate using their actual pod IPs with no NAT. The VXLAN tunnel encapsulates the pod-IP packet in a UDP packet using node IPs for transport, then unwraps it on the destination node, delivering the original pod-IP packet. Pod IPs are preserved end-to-end.”

Situation: A NetworkPolicy is applied to a namespace

Misconception: “The Kubernetes API server enforces the NetworkPolicy, blocking traffic at the control plane level when pods try to communicate.”

Reality: “NetworkPolicy is enforced by the CNI plugin running on each node, which translates the policy into iptables rules or eBPF programs on that node. The API server stores the policy object but does not enforce it. If your CNI plugin does not support NetworkPolicy (e.g., Flannel), the policies are stored but silently ignored -- no blocking occurs.”
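A quick way to verify that policies are actually enforced in your cluster rather than silently ignored. A sketch with illustrative pod names and IPs:

# Which CNI is installed? Check the node config and kube-system daemonsets:
$ ls /etc/cni/net.d/
$ kubectl get daemonsets -n kube-system

# Apply a default-deny policy to a namespace, then test that traffic is blocked:
$ kubectl exec test-client -- wget -qO- --timeout=3 10.244.1.7 || echo BLOCKED
# With Flannel alone this still succeeds -- the policy object exists in etcd,
# but nothing on the node translates it into rules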

How It Actually Works: Packet Trace Step by Step

Every hop a cross-node packet takes

1. Pod A sends to 10.244.1.7 -- the container process opens a socket. The kernel in Pod A's network namespace looks up its routing table. Route: 0.0.0.0/0 via 10.244.0.1 (the cni0 bridge gateway). The packet exits eth0 (the pod side of the veth pair).

2. The veth pair delivers to the cni0 bridge -- the packet crosses the veth pair and arrives at the host namespace side. The bridge sees destination 10.244.1.7 -- not a local pod (local pods are 10.244.0.x) -- and hands the packet to the host routing table.

3. The host routing table selects the VXLAN interface -- the node's routing table has: 10.244.1.0/24 via dev flannel.1 (added by the CNI plugin when the cluster was set up or when a new node joined). The packet is handed to flannel.1.

4. VXLAN encapsulation for cross-node transport -- flannel.1 looks up its FDB (forwarding database) to find which node owns 10.244.1.0/24 (Node 2 at 192.168.1.11). It encapsulates the pod IP packet in UDP: outer src=192.168.1.10, dst=192.168.1.11, UDP port 8472. This packet travels over the physical network.

5. The destination node decapsulates and delivers -- Node 2 receives UDP on port 8472. The VXLAN module decapsulates it, recovering the inner pod IP packet (src=10.244.0.5, dst=10.244.1.7). The host routing table forwards to the cni0 bridge, ARP finds the MAC of Pod B via its veth pair, and Pod B receives the packet on eth0.
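Each routing decision in this trace can be confirmed with ip route get, which asks the kernel which interface it would pick for a given destination. On Node 1, using the addresses above (exact output varies by CNI):

$ ip route get 10.244.1.7
10.244.1.7 via 10.244.1.0 dev flannel.1 onlink
# "dev flannel.1" confirms step 3: the packet is handed to the VXLAN interface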

debug-cross-node-networking.sh

# Get pod IPs and node assignments
$ kubectl get pods -o wide
NAME    READY   IP           NODE
pod-a   1/1     10.244.0.5   node-1
pod-b   1/1     10.244.1.7   node-2

# From pod-a, test connectivity to pod-b
$ kubectl exec -it pod-a -- ping 10.244.1.7
# If this fails, narrow down the layer:

# ON NODE-1: check that the routing table has a route to pod-b's subnet
$ ip route show | grep 10.244.1
10.244.1.0/24 via 10.244.1.0 dev flannel.1 onlink
# Missing? The CNI plugin may not have set up routes for node-2

# ON NODE-1: check the VXLAN FDB (which node owns which pod subnet)
$ bridge fdb show dev flannel.1
52:c5:3a:b3:e4:f1 dst 192.168.1.11 self permanent
# Maps a MAC to node-2's IP -- this is how VXLAN knows where to send

# ON NODE-1: can node-1 reach node-2 on the VXLAN port?
$ nc -vzu 192.168.1.11 8472
# If this fails: a cloud security group is blocking UDP 8472 -- MOST COMMON CAUSE

# Check for MTU issues (VXLAN adds ~50 bytes of overhead)
$ ping -M do -s 1450 192.168.1.11
# If 1450 fails but 1400 works: MTU mismatch
# Fix: set the pod MTU to 1450 in the CNI config (default for a VXLAN overlay)

The route 10.244.1.0/24 via flannel.1 is added by the CNI plugin. If it is missing, the CNI did not successfully register the node or its routes.

The VXLAN FDB maps remote pod MAC addresses to the node IP that owns them. It is populated by the CNI control plane (the Flannel daemon) watching for new nodes.

UDP port 8472 (Flannel) or 4789 (Calico/Cilium VXLAN) must be allowed between all node IPs in cloud security groups. This is the #1 cause of cross-node pod networking failures.

What Breaks in Production: Blast Radius

Blast radius: container networking failure modes

  • VXLAN port blocked by cloud security group — All cross-node pod communication fails silently. Same-node pods work fine. Most common cause of the "cross-node but not same-node" failure pattern.
  • MTU mismatch (VXLAN overhead not accounted for) — Large packets fragmented or dropped. TCP connections work but HTTP responses with large bodies fail intermittently. Affects ~50% of API calls depending on response size.
  • conntrack table full — New TCP connections fail with silent drops or "connection refused". Existing connections continue. Creates mysterious intermittent failures at scale (Cloudflare 2019 incident).
  • cni0 bridge missing after node restart — All pods on node lose network connectivity. Requires CNI plugin restart or node drain and recycle -- affects all workloads on that node.
  • ARP table stale after pod reschedule — Old pod IP cached in bridge ARP table for 60 seconds. New pod at same IP unreachable until ARP cache expires -- causes mysterious short-term failures after pod restarts.
  • Pod CIDR overlaps existing network range — Pods cannot communicate with external services on the overlapping range. No error message -- silent routing black hole. Plan pod CIDR carefully before cluster creation.
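For the conntrack failure mode specifically, the table can be checked before it overflows. Run on a node (the /proc paths exist once the conntrack kernel modules are loaded):

$ cat /proc/sys/net/netfilter/nf_conntrack_count
$ cat /proc/sys/net/netfilter/nf_conntrack_max
# If count approaches max, new connections are silently dropped;
# the kernel log confirms it:
$ dmesg | grep conntrack
# "nf_conntrack: table full, dropping packet"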

No NetworkPolicy -- flat network allows any pod to reach any pod

Bug
# Default: NO NetworkPolicy in namespace
# Any pod anywhere in the cluster can reach the production database.
# A compromised pod in 'dev' namespace connects to 'production' DB.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: production-database
  namespace: production
spec:
  template:
    spec:
      containers:
      - name: postgres
        image: postgres:15
        ports:
        - containerPort: 5432
  # No NetworkPolicy exists -- any pod in the cluster can connect
Fix
# Default-deny all ingress, then allow only specific sources
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-access-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      role: database
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          env: production       # Only production namespace
      podSelector:
        matchLabels:
          app: api-server       # Only api-server pods
    ports:
    - port: 5432
      protocol: TCP
  egress: []   # No egress allowed (DB only responds, never initiates)

Without NetworkPolicy, the pod network is flat -- every pod can reach every other pod regardless of namespace. NetworkPolicy adds L3/L4 firewall rules enforced by the CNI plugin using iptables/eBPF on each node. Apply a default-deny policy per namespace and explicitly whitelist only required connections. Note: Flannel does not enforce NetworkPolicy -- you need Calico, Cilium, or Weave for enforcement.
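The per-namespace default-deny baseline mentioned above looks like this; applying it blocks all ingress and egress for every pod in the namespace until explicit allow policies are added:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}        # empty selector matches every pod in the namespace
  policyTypes:
  - Ingress
  - Egress

Note that a full egress deny also blocks DNS; allow UDP port 53 to kube-dns if pods in the namespace need name resolution.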

Decision Guide: Which CNI Plugin?

Do you need NetworkPolicy enforcement AND advanced observability (eBPF, L7 visibility, Hubble)?
Yes: Use Cilium -- eBPF-based, no iptables conntrack issues, Hubble network observability, L7-aware NetworkPolicy. Requires kernel 4.9+.
No: Continue.
Do you need BGP peering with existing network infrastructure (bare metal or on-prem)?
Yes: Use Calico in native routing mode -- no VXLAN overhead, BGP peers with existing routers, excellent NetworkPolicy support.
No: Continue.
Is this a managed cloud cluster (EKS, GKE, AKS) where VPC-native networking is available?
Yes: Use the cloud-native CNI (AWS VPC CNI, GKE native networking) for the best performance and simplest operation. Add Calico for NetworkPolicy if needed.
No: Use Flannel for maximum simplicity, or Weave Net if you also want NetworkPolicy support with a simple setup (note: Weave Net is no longer actively maintained).

Cost and Complexity: CNI Plugin Trade-offs

CNI Plugin  | Networking Model       | NetworkPolicy                     | Performance                    | Complexity
Flannel     | VXLAN overlay          | No (needs Calico policy overlay)  | Medium (VXLAN overhead)        | Low -- simple to operate
Calico      | BGP or VXLAN           | Yes (iptables or eBPF)            | High (BGP native: no overlay)  | Medium -- BGP config for native mode
Cilium      | eBPF (no iptables)     | Yes (L3/L4/L7 via eBPF)           | Highest (bypasses conntrack)   | High -- modern kernel required
AWS VPC CNI | VPC-native (ENI)       | Yes (via Calico or native policy) | Highest (native VPC routing)   | Medium -- AWS-only, IP exhaustion risk
Weave Net   | VXLAN or fast datapath | Yes (iptables)                    | Medium                         | Low -- automated mesh setup

Exam Answer vs. Production Reality


How pod-to-pod traffic moves

📖 What the exam expects

Each pod has a network namespace with a veth pair: one end (eth0) in the pod namespace, one end on the host connected to the cni0 bridge. Cross-node traffic uses CNI overlay (VXLAN encapsulates pod IP packets in UDP) or BGP routing (Calico native mode routes pod IPs directly). No NAT between pods.


How this might come up in interviews

Appears in network troubleshooting questions, CNI selection discussions, and security architecture interviews. "Trace a packet from Pod A to Pod B on a different node" is a classic senior Kubernetes interview question.

Common questions:

  • Trace the packet path from Pod A on Node 1 to Pod B on Node 2
  • What is a veth pair and how does it connect a pod to the node network?
  • What is the difference between overlay networking and BGP-based routing?
  • How does iptables conntrack table exhaustion affect Kubernetes networking?
  • Why might pods on the same node communicate but fail across nodes?

Strong answer: Mentioning conntrack table exhaustion (Cloudflare 2019 incident). Knowing Cilium bypasses iptables with eBPF. Discussing MTU and VXLAN overhead. Being able to use ip route, ip addr, and bridge fdb on a node to trace packet paths.

Red flags: Thinking pods use NAT to communicate (Kubernetes requires direct pod-to-pod routing). Not knowing what a veth pair is. Believing NetworkPolicy is enforced by the API server (it is enforced by the CNI plugin on each node).

Related concepts

Explore topics that connect to this one.

  • CNI Plugins: How Pods Get Their IPs
  • Services & Endpoints: Stable Networking for Ephemeral Pods
  • What is networking?
