Debugging Kubernetes NetworkPolicy: Complete Troubleshooting Guide (2025)
A comprehensive guide to debugging Kubernetes NetworkPolicy issues. Learn how to troubleshoot connection failures, test policies, and fix common NetworkPolicy configuration problems.
Quick Diagnosis Checklist
- Is your CNI plugin NetworkPolicy-compatible? (Calico, Cilium, Weave work - Flannel doesn't)
- Are pod labels correct and matching selectors?
- Is the default-deny policy blocking unintended traffic?
- Are namespace selectors configured correctly? (see the quick triage sketch below)
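The steps later in this guide cover each check in detail; as a quick triage sketch (the namespace name is a placeholder, adjust to your cluster):
# 1. Which CNI is running? (determines whether NetworkPolicy is enforced at all)
kubectl get pods -n kube-system | grep -E 'calico|cilium|weave|flannel'
# 2. Which policies exist in the namespace, and what do they select?
kubectl get networkpolicy -n your-namespace
# 3. Do the pod labels actually match those selectors?
kubectl get pods -n your-namespace --show-labels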
Understanding How NetworkPolicy Works
Kubernetes NetworkPolicy controls traffic flow between pods. Understanding the core concepts is critical for debugging.
Key Concepts
Pod Selection
NetworkPolicy uses podSelector to target specific pods by labels.
Ingress/Egress
Ingress controls incoming traffic, Egress controls outgoing traffic.
Default Deny
Once a pod is selected by any NetworkPolicy, all traffic in the directions listed in that policy's policyTypes is denied except what's explicitly allowed (see the sketch below).
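To make the directions concrete, here is a minimal illustrative skeleton (the policy name is hypothetical): it selects app=web pods and, because both policyTypes are listed but no rules are given, denies all traffic in both directions:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-lockdown # hypothetical name
spec:
  podSelector: # which pods this policy applies to
    matchLabels:
      app: web
  policyTypes: # which directions this policy restricts
  - Ingress
  - Egress
  # No ingress/egress rules listed: nothing is allowed in either direction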
Critical Rule: Default Deny Behavior
THIS IS THE MOST COMMON SOURCE OF CONFUSION
As soon as ONE NetworkPolicy selects a pod:
- ALL traffic in the directions that policy covers (its policyTypes) is denied by default
- You must explicitly allow each traffic flow you want
- Multiple policies are additive (OR logic): a flow is allowed if ANY policy allows it (see the example below)
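For example, a sketch assuming pods labeled app=web (policy names and the monitoring flow are illustrative): with both policies below applied, a web pod accepts port 80 from app=frontend pods and port 9090 from app=monitoring pods, because each connection only needs to match one policy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http-from-frontend # illustrative
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-metrics-from-monitoring # illustrative
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: monitoring
    ports:
    - protocol: TCP
      port: 9090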
Example Scenario
# You create this policy to allow port 80
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80
# Result: Port 80 from app=frontend pods is allowed, but all other INGRESS is blocked
# - Port 443? BLOCKED
# - Port 8080? BLOCKED
# - Traffic from other pods? BLOCKED
# - Egress traffic? Still allowed: with no egress rules, policyTypes defaults to Ingress only
Common Symptoms and Root Causes
Symptom: Connection Timeout
What You See
$ kubectl exec pod-a -- curl pod-b:80
curl: (28) Failed to connect to pod-b port 80 after 30000 ms: Timeout was reached
Possible Causes
- NetworkPolicy blocking the connection
- Pod labels don't match selectors
- Wrong namespace selector
- Missing egress policy on source pod
- Missing ingress policy on destination pod (see the triage sketch below)
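A sketch for narrowing these down (pod and namespace names are placeholders): check which policies select each end of the connection before digging deeper:
# Policies in the destination namespace (ingress side)
kubectl get networkpolicy -n target-namespace
kubectl get pod pod-b -n target-namespace --show-labels
# Policies in the source namespace that restrict egress
kubectl get networkpolicy -n source-namespace -o yaml | grep -B 3 -A 3 "Egress"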
Symptom: Connection Refused
What You See
$ kubectl exec pod-a -- curl pod-b:80
curl: (7) Failed to connect to pod-b port 80: Connection refused
Possible Causes
- Usually NOT a NetworkPolicy issue: policies drop packets silently, while "refused" means something actively rejected the connection
- Target pod not listening on that port
- Service misconfigured
- Application not running (see the checks below)
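Because the connection was actively refused rather than silently dropped, inspect the application side; a sketch (service and pod names are placeholders, and ss must exist in the target image):
# Does the Service have ready endpoints behind it?
kubectl get endpoints pod-b-service -n target-namespace
# Is anything listening on the expected port inside the pod?
kubectl exec pod-b -n target-namespace -- ss -tlnp
# Is the container running, or crash-looping?
kubectl get pod pod-b -n target-namespace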
Symptom: DNS Resolution Fails
What You See
$ kubectl exec pod-a -- curl google.com
curl: (6) Could not resolve host: google.com
Possible Causes
- Egress policy blocking DNS (port 53 UDP) to kube-dns
- Missing egress rule for kube-system namespace (see the quick check below)
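To separate a policy problem from a CoreDNS problem, query the DNS service IP directly; a sketch (10.96.0.10 is a common default, confirm it with the first command):
# Find the cluster DNS ClusterIP
kubectl get svc -n kube-system kube-dns
# Query it directly; a timeout here points at an egress policy blocking port 53,
# while success here with failing plain-name lookups points at resolv.conf/search paths
kubectl exec source-pod -- nslookup kubernetes.default.svc.cluster.local 10.96.0.10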
Step-by-Step Diagnosis Process
Step 1: Verify CNI Plugin Supports NetworkPolicy
Check Your CNI
# Check which CNI you're using
kubectl get pods -n kube-system | grep -E 'calico|cilium|weave|flannel'
# CNI NetworkPolicy Support:
# ✅ Calico - Full support
# ✅ Cilium - Full support
# ✅ Weave Net - Full support (but the project is no longer maintained)
# ❌ Flannel - NO support
# ⚠️ Amazon VPC CNI (EKS) - Native support only with VPC CNI v1.14+ and the network policy agent enabled; otherwise use Calico or security groups
If your CNI doesn't support NetworkPolicy: policies will be ignored silently. Switch to Calico/Cilium or use cloud-native alternatives.
Step 2: Check if Pods Have Labels
List Pod Labels
# Check labels on your pods
kubectl get pods -n your-namespace --show-labels
# Get specific pod labels
kubectl get pod your-pod -n your-namespace -o jsonpath='{.metadata.labels}' | jq
# Example output:
# {
# "app": "web",
# "version": "v1",
# "tier": "frontend"
# }
Step 3: List All NetworkPolicies Affecting a Pod
Find Policies Selecting Your Pod
# Get all NetworkPolicies in namespace
kubectl get networkpolicies -n your-namespace
# Describe specific policy
kubectl describe networkpolicy policy-name -n your-namespace
# Check which pods are selected by a policy
kubectl get pods -n your-namespace -l app=web # Match podSelector
Step 4: Test Connectivity Step by Step
Test Pod-to-Pod Connection
# Get target pod IP
kubectl get pod target-pod -n target-namespace -o wide
# Test from source pod
kubectl exec source-pod -n source-namespace -- curl -v target-pod-ip:80 --max-time 5
# Test DNS resolution
kubectl exec source-pod -n source-namespace -- nslookup target-service.target-namespace.svc.cluster.local
# Test specific port with netcat
kubectl exec source-pod -n source-namespace -- nc -zv target-pod-ip 80
Step 5: Check for Default Deny Policies
Default Deny Patterns
# Look for policies with empty ingress/egress
kubectl get networkpolicy -A -o yaml | grep -A 20 "podSelector: {}"
# Common default-deny patterns:
# 1. Deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {} # Selects all pods
  policyTypes:
  - Ingress
# 2. Deny all egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress
Testing NetworkPolicy Configurations
Test Pod Creation
Create Test Pods
# Create source pod with curl
kubectl run test-source --image=nicolaka/netshoot -n default -- sleep 3600
# Create target pod (nginx)
kubectl run test-target --image=nginx --labels="app=web" -n default
# Get target pod IP
TARGET_IP=$(kubectl get pod test-target -o jsonpath='{.status.podIP}')
# Test connection
kubectl exec test-source -- curl -v $TARGET_IP --max-time 5
Test with netshoot Container
netshoot Has All Debugging Tools
# Run netshoot pod
kubectl run netshoot --rm -it --image=nicolaka/netshoot -- bash
# Inside netshoot:
# Test connectivity
curl -v target-service:80
# Check DNS
nslookup target-service.namespace.svc.cluster.local
# TCP port test
nc -zv target-ip 80
# UDP port test (DNS)
nc -zuv 10.96.0.10 53
# Trace route
traceroute target-ip
# Check HTTP response
curl -I http://target-service:80
Test DNS Resolution
Common DNS Issue: Egress to kube-dns Blocked
# Test DNS from pod
kubectl exec source-pod -- nslookup kubernetes.default
# If DNS fails, check egress to kube-system
# You need this egress rule:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
Common NetworkPolicy Problems and Fixes
Problem 1: Forgot to Allow DNS
Symptom
Pods can't resolve domain names after applying egress policy.
Fix
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  # Allow DNS
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow other egress
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
Problem 2: Cross-Namespace Communication Fails
Symptom
Pod in namespace-a can't reach pod in namespace-b.
Fix
# Ingress policy on destination pod (namespace-b)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-namespace-a
  namespace: namespace-b
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    # namespaceSelector and podSelector in the SAME list element: both must match (AND).
    # As separate list elements they would be OR'd, allowing far more traffic.
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: namespace-a
      podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
Important: Namespace label must exist!
# Since Kubernetes v1.21, every namespace automatically carries the immutable
# kubernetes.io/metadata.name label. Verify it is present:
kubectl get namespace namespace-a --show-labels
# On older clusters, add a custom label and match that in your namespaceSelector instead.
Problem 3: Forgot Both Ingress AND Egress
Symptom
Even though ingress is allowed, the connection still times out.
Root Cause
You allowed INGRESS on the destination, but forgot EGRESS on the source pod.
Fix
# Source pod needs egress rule
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-egress
  namespace: frontend-ns
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: backend-ns
      podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 8080
# Destination pod needs ingress rule
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-ingress
  namespace: backend-ns
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend-ns
      podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
Problem 4: Wrong Label Selector
Symptom
Policy exists but doesn't seem to apply.
Root Cause
Pod labels don't match the podSelector in NetworkPolicy.
Debug
# Check pod labels
kubectl get pod target-pod -o jsonpath='{.metadata.labels}' | jq
# Check NetworkPolicy selector
kubectl get networkpolicy my-policy -o yaml | grep -A 5 podSelector
# Common mistakes:
# Policy: app=web         Pod: app=webserver   (no match)
# Policy: tier=frontend   Pod: role=frontend   (wrong key)
# Policy: app=api         Pod: app=API         (case mismatch)
Problem 5: Allowing External Traffic (Internet)
Symptom
Pod can't reach external APIs or websites.
Fix
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  # Allow all external traffic
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32 # Block AWS metadata
    ports:
    - protocol: TCP
      port: 443 # HTTPS
    - protocol: TCP
      port: 80 # HTTP
  # Allow DNS
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
Essential Debugging Tools
1. kubectl exec with curl
# Basic connectivity test
kubectl exec source-pod -- curl -v http://target-service:80 --max-time 5
# Test specific IP and port
kubectl exec source-pod -- curl -v http://10.244.1.5:8080
# Silent test (just check exit code)
kubectl exec source-pod -- curl -f -s http://target-service:80 && echo "SUCCESS" || echo "FAIL"
2. netshoot Container
netshoot has all networking tools pre-installed (curl, dig, nc, tcpdump, etc.)
# Run as standalone pod
kubectl run netshoot --rm -it --image=nicolaka/netshoot -- bash
# Attach netshoot as an ephemeral debug container to an existing pod
kubectl debug -it existing-pod --image=nicolaka/netshoot --target=container-name
3. kubectl exec with nc (netcat)
# TCP port test
kubectl exec source-pod -- nc -zv target-pod-ip 80
# UDP port test (DNS); use the kube-dns ClusterIP directly if DNS itself is broken
kubectl exec source-pod -- nc -zuv kube-dns.kube-system 53
# Listen on port (for reverse testing)
kubectl exec target-pod -- nc -l -p 8080
4. Cilium CLI (if using Cilium CNI)
# Install Cilium CLI
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
# Check connectivity
cilium connectivity test
# Monitor dropped traffic (cilium monitor runs inside a Cilium agent pod, not the CLI)
kubectl -n kube-system exec -it ds/cilium -- cilium monitor --type drop
5. Network Policy Advisor (inspektor-gadget)
# Install inspektor-gadget
kubectl gadget deploy
# Record traffic and generate suggested policies with the advise gadget
kubectl gadget advise network-policy monitor -n default --output ./networktrace.log
kubectl gadget advise network-policy report --input ./networktrace.log
NetworkPolicy Debugging Flowchart
- Connection times out? Check whether a NetworkPolicy is blocking it
- Connection refused? NOT a NetworkPolicy issue - check application
- DNS fails? Allow egress to kube-system port 53
- Cross-namespace fails? Check namespace labels and both ingress/egress
- Policy not working? Verify CNI supports NetworkPolicy
- Still stuck? Use netshoot pod + tcpdump to capture packets (see the sketch below)
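A packet-capture sketch for that last step, using an ephemeral debug container (pod and container names are placeholders): all containers in a pod share the network namespace, so tcpdump in the debug container sees the pod's traffic:
# Attach netshoot to the destination pod and watch for incoming SYNs
kubectl debug -it target-pod --image=nicolaka/netshoot --target=app-container \
  -- tcpdump -i any -nn 'tcp port 8080'
# No SYN arriving at all? The packet is dropped before the pod:
# suspect a NetworkPolicy (or routing/CNI issue).
# SYN arrives but no SYN-ACK? Look at the application itself;
# policies are stateful, so reply packets are never the problem.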