Observability & Maintenance
Tell whether an app is healthy, read what it is doing, debug it, and keep manifests current.
The full loop: understand → drill the commands → prove it under the clock. Every objective below is taught here and practised in the drills.
1 Understand it
Liveness, readiness & startup probes
Probes let Kubernetes check your app's health. Liveness asks "is this stuck?" and restarts the container if so. Readiness asks "is this ready for traffic?" and, if not, stops sending requests without killing the container. Startup gives slow-booting apps time to come up: liveness and readiness checks are held off until the startup probe succeeds.
- Liveness failure → container restarted.
- Readiness failure → Pod removed from Service endpoints (no restart).
- Put dependency checks in readiness, not liveness, to avoid restart storms.
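The startup probe described above can be sketched as follows. This is a minimal illustration, not a definitive configuration: the Pod name slow-app, the image registry.example.com/slow-app:1.0, the /healthz path, port 8080, and the threshold values are all assumptions chosen for the example. The script only writes a manifest file and prints the resulting startup budget, which is failureThreshold × periodSeconds.

```shell
#!/usr/bin/env bash
# Write an illustrative Pod manifest with a startup probe.
# All names and values below are placeholders, not taken from a real app.
FAILURE_THRESHOLD=30   # consecutive failures tolerated while starting
PERIOD_SECONDS=10      # seconds between startup checks

cat > startup-demo.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: slow-app
spec:
  containers:
  - name: app
    image: registry.example.com/slow-app:1.0   # hypothetical image
    startupProbe:              # liveness/readiness wait until this succeeds
      httpGet: { path: /healthz, port: 8080 }
      failureThreshold: ${FAILURE_THRESHOLD}
      periodSeconds: ${PERIOD_SECONDS}
    livenessProbe:             # takes over once startup has passed
      httpGet: { path: /healthz, port: 8080 }
      periodSeconds: 10
EOF

# Longest startup time tolerated before the container is restarted.
STARTUP_BUDGET=$((FAILURE_THRESHOLD * PERIOD_SECONDS))
echo "startup budget: ${STARTUP_BUDGET}s"
```

With these values the app gets up to 300 seconds to start. One way to check the file before creating anything is kubectl apply --dry-run=client -f startup-demo.yaml.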
Liveness + readiness on a container
livenessProbe:            # restart if this fails
  httpGet: { path: /healthz, port: 8080 }
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:           # stop traffic if this fails
  httpGet: { path: /ready, port: 8080 }
  failureThreshold: 3
Logs & events
When something is wrong, two commands tell most of the story. kubectl logs shows what a container printed; kubectl describe shows recent Events such as scheduling failures, image pull errors, and probe failures.
- kubectl logs pod [-c container] [--previous]: -c selects a container in a multi-container Pod; --previous shows the last terminated instance, which is what you want for crashed containers.
- kubectl describe pod surfaces Events at the bottom; read them first.
- kubectl get events --sort-by=.lastTimestamp for a timeline.
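The three commands above are often run together as a first look. A minimal sketch, assuming a hypothetical helper named triage that takes a Pod name and an optional namespace (both the function name and the 50-line tail are illustrative choices, not part of kubectl):

```shell
#!/usr/bin/env bash
# triage POD [NAMESPACE]: hypothetical helper that runs the usual
# first-look commands in one go. Namespace defaults to "default".
triage() {
  local pod="$1" ns="${2:-default}"

  # Events appear at the bottom of the describe output.
  kubectl describe pod "$pod" -n "$ns"

  # Output of the current container instance (last 50 lines).
  kubectl logs "$pod" -n "$ns" --tail=50

  # Output of the previous instance. This errors if the container has
  # never restarted, so a failure here is ignored.
  kubectl logs "$pod" -n "$ns" --previous --tail=50 || true

  # Timeline of events that reference this Pod, oldest first.
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```

For example, triage web prod would inspect a Pod called web in the prod namespace. For a multi-container Pod you would add -c container to the two logs calls.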
Debugging & metrics
Beyond logs, exec into a running container to poke around, and use kubectl top (when metrics-server is installed) for live CPU/memory. These help you find why something is slow or crashing.
- kubectl exec -it pod -- sh to get a shell inside.
- kubectl top pod / node for live resource usage.
- kubectl get pod -o wide shows node and IP for network debugging.
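The same commands can be combined into a quick, non-interactive look at a Pod. This is a sketch under assumptions: inspect is a hypothetical helper name, and the env command must exist in the container image for the exec step to work.

```shell
#!/usr/bin/env bash
# inspect POD [NAMESPACE]: hypothetical helper for a quick look at a Pod
# without opening an interactive shell.
inspect() {
  local pod="$1" ns="${2:-default}"

  # Node placement and Pod IP.
  kubectl get pod "$pod" -n "$ns" -o wide

  # Per-container CPU/memory. Needs metrics-server, so a failure is tolerated.
  kubectl top pod "$pod" -n "$ns" --containers || true

  # Run a one-off command inside the container (no -it needed for this).
  kubectl exec "$pod" -n "$ns" -- env
}
```

When a one-off command is not enough, kubectl exec -it pod -- sh gives an interactive shell instead.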
API deprecations
Kubernetes evolves, and old API versions get removed in newer releases (e.g. an object that used a beta apiVersion now needs the stable one). Applying a manifest with a removed apiVersion fails; you fix it by updating the apiVersion to one the cluster currently serves, adjusting any fields whose schema changed between versions.
- apiVersion must match what the cluster version still serves.
- Deprecated ≠ removed: you get warnings before a version disappears.
- kubectl explain and kubectl api-resources show the current version.
Find the current apiVersion for a kind
kubectl explain deployment                # shows the version: apps/v1 (newer kubectl prints GROUP: apps and VERSION: v1)
kubectl api-resources | grep deployment   # lists the kind with its API group/version
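To check a manifest against what the cluster still serves, one option is to compare its apiVersion with the output of kubectl api-versions. The sketch below assumes three hypothetical helpers (is_served, manifest_version, check_manifest) and a simple manifest whose first top-level apiVersion line is unquoted; it is an illustration, not a full validator.

```shell
#!/usr/bin/env bash
# is_served GROUP/VERSION: succeeds if the cluster currently serves that
# API version (exact, whole-line match against kubectl api-versions).
is_served() {
  kubectl api-versions | grep -qxF "$1"
}

# manifest_version FILE: print the first top-level apiVersion in FILE.
manifest_version() {
  awk '/^apiVersion:/ { print $2; exit }' "$1"
}

# check_manifest FILE: report whether the manifest's apiVersion is served.
# Returns 1 when the version is not served.
check_manifest() {
  local v
  v="$(manifest_version "$1")"
  if is_served "$v"; then
    echo "$v is served"
  else
    echo "$v is NOT served; update the manifest"
    return 1
  fi
}
```

For example, a Deployment manifest still using extensions/v1beta1 (removed in Kubernetes 1.16) would be reported as not served, and the fix is to change it to apps/v1.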
2 Drill the commands & prove it
An objective turns green only when you've solved every drill in it, not just one.
- Define and inspect liveness, readiness, and startup probes via explain and generate-then-edit YAML
- Retrieve container logs including previous, multi-container, follow, since, and tail variants
- List and sort cluster events for troubleshooting across namespaces
- Debug running pods with describe, exec, top, port-forward, and ephemeral debug containers
- Extract specific fields using jsonpath, custom-columns, sort-by, and wide output
- Discover API resources, versions, and deprecations with api-resources, api-versions, and explain
You need the exact field names under the liveness probe. Use kubectl explain to show the fields of the container livenessProbe.
Drills check the command pattern deterministically, but there is often more than one correct form. For full fidelity, pair this with real-cluster reps (the killer.sh simulator is included free with your exam registration).