Hands-on Kubernetes & DevOps, Part 6 of 6
Zero-Trust Networking in Kubernetes: Network Policies with Calico
Why flat pod-to-pod networking is risky, how NetworkPolicies enforce allow-lists at the CNI layer, and how to prove isolation with Calico-backed tests and blast-radius checks.
Quick navigation
- The problem: flat networking by default
- How Network Policies work
- Prerequisites: Calico CNI
- Document the "before" state
- The policies
- Testing: prove what you enforced
- The blast-radius test
- Common mistakes
- Production patterns
- Key takeaways
The problem: flat networking by default
Picture a fresh cluster: two namespaces, dev and prod. It is easy to assume they are isolated—different namespaces, different environments, different teams. In reality, the data plane does not care about those boundaries unless you add policy.
A pod in dev can call a Service in prod today:
# Pod in dev namespace calling prod service
kubectl run test --image=alpine --rm -it --restart=Never \
-n dev -- wget -qO- --timeout=5 \
http://my-api-service.prod.svc.cluster.local/health
# Returns: {"healthy":true,"uptime":5718}
That succeeds because by default, every pod can talk to every other pod (cluster networking is permissive until you restrict it). Any workload can reach your databases, internal APIs, and control-plane-adjacent components reachable from the pod network.
That is tolerable for experiments. In production, it means one compromised workload can move laterally across the whole cluster with few network-level guardrails.
NetworkPolicies are Kubernetes’ answer: declarative firewall rules for pod traffic, enforced by your CNI when it implements the API.
How Network Policies work
Use this mental model:
No NetworkPolicy selects a pod → all traffic allowed (for that pod)
Some NetworkPolicy selects a pod → only explicitly allowed traffic is permitted
If several policies apply to the same pod, allowed traffic is the union of everything those policies permit. Policies are additive allows; there is no priority mechanism and no “deny rule” primitive inside a policy—everything is framed as exceptions to the default isolation once a pod is selected.
Three fields show up everywhere:
podSelector — which pods this policy applies to in its namespace. {} selects every pod in that namespace.
namespaceSelector / peer selectors — where traffic may come from or go to (namespaces, pods, or IP blocks).
policyTypes — Ingress for inbound, Egress for outbound. You can declare one or both; a pod that no Egress policy selects keeps unrestricted outbound traffic until you tighten it.
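A compact sketch of those fields in action (labels hypothetical): both policies below select pods labeled app=web, so those pods accept traffic from app=frontend pods in the same namespace and from any pod in the monitoring namespace, the union of the two allows.
# Two policies selecting the same pods: allowed traffic is the union
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend          # hypothetical name
  namespace: dev
spec:
  podSelector:
    matchLabels:
      app: web                  # hypothetical label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring        # hypothetical name
  namespace: dev
spec:
  podSelector:
    matchLabels:
      app: web                  # same pods as above
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring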
Prerequisites: Calico CNI
NetworkPolicy objects are standard resources, but they only do something if the CNI enforces them. The default Minikube networking path often does not. Calico does.
minikube start --driver=docker --memory=3500 --cpus=4 --cni=calico
kubectl get pods -n kube-system | grep calico
# calico-kube-controllers-xxx 1/1 Running
# calico-node-xxx 1/1 Running
On managed offerings, the knob differs: EKS often uses the Calico Helm chart or operator add‑on paths; GKE enables policy enforcement per cluster; AKS exposes it at create time. The YAML stays portable—only installation and troubleshooting change.
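For orientation only, the rough shape of those knobs (cluster and group names hypothetical; verify current flags and versions against each provider's docs before relying on them):
# GKE: policy enforcement is a cluster-level flag at create time
gcloud container clusters create my-cluster --enable-network-policy

# AKS: pick a policy engine when the cluster is created
az aks create --resource-group my-rg --name my-cluster --network-policy calico

# EKS: Calico is typically layered on afterwards, e.g. via the Tigera operator
# (version pinned here purely for illustration)
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/tigera-operator.yaml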
Document the "before" state
Before changing anything, capture evidence. Run the same cross‑namespace probe you expect to block later:
kubectl run test-before --image=alpine --rm -it --restart=Never \
-n dev -- wget -qO- --timeout=5 \
http://my-api-service.prod.svc.cluster.local/health
# {"healthy":true,"uptime":5718} ← save this; after policies, this should fail
That screenshot or log excerpt is what you attach to a security review: before permissive, after denied.
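If you want that evidence in a file rather than terminal scrollback, a minimal sketch (directory and file names hypothetical):
# Timestamped capture of the "before" probe for the security review
mkdir -p evidence
kubectl run test-before --image=alpine --rm -i --restart=Never \
  -n dev -- wget -qO- --timeout=5 \
  http://my-api-service.prod.svc.cluster.local/health \
  2>&1 | tee "evidence/before-$(date +%Y%m%d-%H%M%S).log"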
The policies
Below are three policies that work together for ingress isolation inside the dev namespace: deny by default, then allow same‑namespace traffic to the API pods. DNS is handled with a focused egress rule so name resolution keeps working once you constrain egress.
# Policy 1 — default deny ingress for every pod in the namespace
# Any pod selected here only accepts ingress explicitly allowed elsewhere
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: dev
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Policy 2 — allow ingress only from pods in the same namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: dev
spec:
  podSelector:
    matchLabels:
      app: my-api-dev
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {} # any pod in namespace dev
---
# Policy 3 — allow DNS egress (easy to forget; without it, Service DNS breaks)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: dev
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
Apply and verify:
kubectl apply -f gitops/my-api/networkpolicy.yaml
kubectl get networkpolicies -n dev
A note on kubectl port-forward: strict ingress policies often mean traffic from your laptop does not look like "in-namespace" pod traffic. Prefer kubectl exec into an allowed client pod for tests, or add a separate, narrowly scoped policy for break-glass access (see the sketch below); avoid cidr: 0.0.0.0/0 on workload ports unless you consciously accept cluster-wide exposure.
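Such a break-glass policy could look like this sketch, assuming admin traffic arrives from a known source range (203.0.113.0/24 below is a documentation placeholder):
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: break-glass-admin        # hypothetical name
  namespace: dev
spec:
  podSelector:
    matchLabels:
      app: my-api-dev
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 203.0.113.0/24 # placeholder; replace with your admin range
      ports:
        - protocol: TCP
          port: 3000             # the container port, not the Service port
Note that kubectl port-forward traffic reaches the pod via the node, so the source IP the CNI sees may not be your laptop's; verify what actually arrives before relying on ipBlock here.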
Testing: prove what you enforced
Assume nothing about the YAML. Exercise the paths that matter:
# Cross-namespace — should FAIL once both sides enforce isolation appropriately
kubectl run test-after --image=alpine --rm -it --restart=Never \
-n prod -- wget -qO- --timeout=5 \
http://my-api-service.dev.svc.cluster.local/health
# wget: download timed out ✅
# Same namespace — should still SUCCEED
kubectl run test-same --image=alpine --rm -it --restart=Never \
-n dev -- wget -qO- --timeout=5 \
http://my-api-service.dev.svc.cluster.local/health
# {"healthy":true,"uptime":6115} ✅
Mirror the manifests for prod when that namespace should also be segmented:
sed 's/namespace: dev/namespace: prod/g; s/app: my-api-dev/app: my-api-prod/g' \
gitops/my-api/networkpolicy.yaml | kubectl apply -f -
Until policies exist in both namespaces (or your default posture is deny everywhere), results can look inconsistent—always confirm which selectors actually apply.
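To make those probes repeatable instead of ad hoc, a sketch you could wire into CI (expectations match this article's setup; a helper script of my own devising, not part of the original series):
#!/bin/bash
# Run a probe pod in a namespace and compare reachability with the expectation
probe() {  # usage: probe <namespace> <url> <ok|blocked>
  if kubectl run "probe-$RANDOM" --image=alpine --rm -i --restart=Never \
      -n "$1" -- wget -qO- --timeout=5 "$2" >/dev/null 2>&1; then
    result=ok
  else
    result=blocked
  fi
  echo "$1 -> $2 : $result (expected $3)"
  [ "$result" = "$3" ]
}

probe dev  http://my-api-service.dev.svc.cluster.local/health ok
probe prod http://my-api-service.dev.svc.cluster.local/health blocked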
The blast-radius test
This is the narrative that resonates in reviews: simulate a rogue pod and list what still works.
kubectl run attacker --image=alpine --rm -it --restart=Never -n dev -- sh
From inside:
# Can reach dev services — same namespace (if your allow rules cover the path)
wget -qO- --timeout=3 http://my-api-service.dev.svc.cluster.local/health
# Cannot reach prod — cross-namespace ingress should be refused
wget -qO- --timeout=3 http://my-api-service.prod.svc.cluster.local/health
# Often cannot reach the Kubernetes API without explicit egress rules
# (a timeout means the network blocked it; a TLS error or 403 means it was reachable)
wget -qO- --timeout=3 --no-check-certificate https://kubernetes.default.svc.cluster.local
With the policies above, containment goals look like:
- No casual path from dev workload network to prod Services.
- Reduced chance of trivial API chatter unless you deliberately allow apiserver egress.
- Blast radius bounded by namespace and label selectors instead of “the whole mesh is trusted.”
The difference is not academic: it separates “one bad deploy in dev” from “someone pivots everywhere.”
Common mistakes
Skipping DNS egress. The moment you add any egress restriction, forgetting UDP/TCP 53 to the cluster DNS Service breaks resolution. Symptoms look like flaky wget http://mysvc with “bad address” even though the Service exists.
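If "port 53 to anywhere" feels too loose, a tighter sketch scoped to the cluster DNS pods, assuming the stock k8s-app=kube-dns labels that CoreDNS carries in kube-system:
egress:
  - to:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: kube-system
        podSelector:
          matchLabels:
            k8s-app: kube-dns
    ports:
      - protocol: UDP
        port: 53
      - protocol: TCP
        port: 53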
Matching Service ports instead of container ports. NetworkPolicy evaluates pod ports. If Service 80 forwards to container 3000, allow 3000 (and the protocols you actually use).
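Concretely, with a Service wired like this (numbers mirroring the example above), the policy port must be 3000:
apiVersion: v1
kind: Service
metadata:
  name: my-api-service
  namespace: dev
spec:
  selector:
    app: my-api-dev
  ports:
    - port: 80          # what clients dial through the Service
      targetPort: 3000  # what the pod listens on; NetworkPolicy matches this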
Testing with naked pods. kubectl run test --image=alpine usually creates pods without the labels your from.podSelector expects, so failures look like “policy broken” when the test harness is wrong.
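kubectl run can attach labels, so the harness can match what the policy expects, e.g. when testing the api-gateway allow-list shown later:
# Label the probe pod so from.podSelector actually matches it
kubectl run test-labeled --image=alpine --rm -it --restart=Never \
  -n dev --labels="app=api-gateway" -- wget -qO- --timeout=5 \
  http://my-api-service.dev.svc.cluster.local/health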
Patching only one namespace. dev lockdown does not magically protect prod. Treat each namespace as its own policy surface.
Production patterns
Metrics scrapes across namespaces. Prometheus in monitoring needs a path in. Allow ingress from that namespace on the scrape port:
ingress:
  - from:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: monitoring
    ports:
      - protocol: TCP
        port: 3000
Microservice allow-lists. Prefer calling out real clients instead of “any pod in this namespace” when you can:
ingress:
  - from:
      - podSelector:
          matchLabels:
            app: api-gateway
Controlled egress to the internet. Example shape: permit 443 outbound while discouraging arbitrary RFC1918 ranges (tune CIDRs to your topology—except lists get brittle fast):
egress:
  - to:
      - ipBlock:
          cidr: 0.0.0.0/0
          except:
            - 10.0.0.0/8
            - 172.16.0.0/12
            - 192.168.0.0/16
    ports:
      - protocol: TCP
        port: 443
Key takeaways
- Kubernetes pod networking is flat until NetworkPolicies (and a capable CNI) say otherwise.
- A namespace-wide default-deny ingress posture flips you into explicit allow-listing for affected pods.
- DNS egress is part of the minimum viable policy set whenever you touch egress.
- Enforcement quality depends on the CNI (Calico and Cilium are common production choices—verify behavior in your cluster).
- Automate repeatable kubectl run/exec probes; YAML alone is not proof.
- Design around blast radius: if this identity is compromised, what does the mesh still let it touch?
Part of a hands-on DevOps learning series. Code at github.com/kaungmyathan22/golang-k8s-portfolio.