Debugging Kubernetes NetworkPolicy: Complete Troubleshooting Guide (2025)

A comprehensive guide to debugging Kubernetes NetworkPolicy issues. Learn how to troubleshoot connection failures, test policies, and fix common NetworkPolicy configuration problems.

Quick Diagnosis Checklist

  • Is your CNI plugin NetworkPolicy-compatible? (Calico, Cilium, Weave work - Flannel doesn't)
  • Are pod labels correct and matching selectors?
  • Is the default-deny policy blocking unintended traffic?
  • Are namespace selectors configured correctly?

Understanding How NetworkPolicy Works

Kubernetes NetworkPolicy controls traffic flow between pods. Understanding the core concepts is critical for debugging.

Key Concepts

Pod Selection

NetworkPolicy uses podSelector to target specific pods by labels.

Ingress/Egress

Ingress rules control incoming traffic; Egress rules control outgoing traffic.

Default Deny

Once a pod is selected by ANY NetworkPolicy, all traffic in the direction(s) covered by that policy's policyTypes is denied except what's explicitly allowed.

Critical Rule: Default Deny Behavior

THIS IS THE MOST COMMON SOURCE OF CONFUSION

As soon as ONE NetworkPolicy selects a pod:

  • ALL traffic in the direction(s) listed in its policyTypes is denied by default
  • You must explicitly allow each traffic flow you want
  • Multiple policies are additive (OR logic): the allowed set is the union of all matching policies (see the second policy after the example below)

Example Scenario

# You create this policy to allow port 80
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 80

# Result: Port 80 from app=frontend is allowed, but all other INGRESS is blocked
# - Port 443? BLOCKED
# - Port 8080? BLOCKED
# - Traffic from other pods? BLOCKED
# - Egress traffic? Still ALLOWED - with no policyTypes set and only ingress
#   rules present, this policy defaults to policyTypes: [Ingress]
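
To see the additive (OR) behavior in action, apply a second policy to the same pods. The allowed set becomes the union of both policies (the allow-https name here is illustrative):

# Second policy on the same pods: 80 AND 443 are now both allowed
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-https
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 443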

Common Symptoms and Root Causes

Symptom: Connection Timeout

What You See

$ kubectl exec pod-a -- curl pod-b:80
curl: (28) Failed to connect to pod-b port 80 after 30000 ms: Timeout was reached

Possible Causes

  • NetworkPolicy blocking the connection
  • Pod labels don't match selectors
  • Wrong namespace selector
  • Missing egress policy on source pod
  • Missing ingress policy on destination pod

Symptom: Connection Refused

What You See

$ kubectl exec pod-a -- curl pod-b:80
curl: (7) Failed to connect to pod-b port 80: Connection refused

Possible Causes

  • Usually NOT a NetworkPolicy issue - most CNIs silently drop denied traffic (causing a timeout), so a refusal means the packet reached the pod
  • Target pod not listening on that port
  • Service misconfigured
  • Application not running

Symptom: DNS Resolution Fails

What You See

$ kubectl exec pod-a -- curl google.com
curl: (6) Could not resolve host: google.com

Possible Causes

  • Egress policy blocking DNS (port 53 UDP) to kube-dns
  • Missing egress rule for kube-system namespace

Step-by-Step Diagnosis Process

Step 1: Verify CNI Plugin Supports NetworkPolicy

Check Your CNI

# Check which CNI you're using
kubectl get pods -n kube-system | grep -E 'calico|cilium|weave|flannel'

# CNI NetworkPolicy Support:
# ✅ Calico - Full support
# ✅ Cilium - Full support
# ✅ Weave Net - Full support
# ❌ Flannel - NO support
# ⚠️ Amazon VPC CNI (EKS) - supported since v1.14 when network policy enforcement is enabled; older versions need Calico or security groups

If your CNI doesn't support NetworkPolicy: Policies will be ignored silently. Switch to Calico/Cilium or use cloud-native alternatives.
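
A quick way to verify enforcement end to end: apply a default-deny policy in a scratch namespace and confirm traffic is actually dropped. A minimal sketch (the namespace and pod names are placeholders):

# Create a scratch namespace with a server and a client
kubectl create namespace np-test
kubectl run web --image=nginx -n np-test
kubectl run client --image=nicolaka/netshoot -n np-test -- sleep 3600
kubectl wait --for=condition=Ready pod/web pod/client -n np-test

# Baseline: this should succeed
WEB_IP=$(kubectl get pod web -n np-test -o jsonpath='{.status.podIP}')
kubectl exec client -n np-test -- curl -s -o /dev/null -w '%{http_code}\n' $WEB_IP --max-time 5

# Apply default-deny ingress; if the curl below still succeeds,
# your CNI is NOT enforcing NetworkPolicy
kubectl apply -n np-test -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF
kubectl exec client -n np-test -- curl -s $WEB_IP --max-time 5  # should time out

# Clean up
kubectl delete namespace np-test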

Step 2: Check if Pods Have Labels

List Pod Labels

# Check labels on your pods
kubectl get pods -n your-namespace --show-labels

# Get specific pod labels
kubectl get pod your-pod -n your-namespace -o jsonpath='{.metadata.labels}' | jq

# Example output:
# {
#   "app": "web",
#   "version": "v1",
#   "tier": "frontend"
# }

Step 3: List All NetworkPolicies Affecting a Pod

Find Policies Selecting Your Pod

# Get all NetworkPolicies in namespace
kubectl get networkpolicies -n your-namespace

# Describe specific policy
kubectl describe networkpolicy policy-name -n your-namespace

# Check which pods are selected by a policy
kubectl get pods -n your-namespace -l app=web  # Match podSelector
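
To see every policy's podSelector at a glance (and spot selectors that match nothing), a jsonpath one-liner helps:

# Print each policy name alongside its podSelector
kubectl get networkpolicy -n your-namespace \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podSelector}{"\n"}{end}'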

Step 4: Test Connectivity Step by Step

Test Pod-to-Pod Connection

# Get target pod IP
kubectl get pod target-pod -n target-namespace -o wide

# Test from source pod
kubectl exec source-pod -n source-namespace -- curl -v target-pod-ip:80 --max-time 5

# Test DNS resolution
kubectl exec source-pod -n source-namespace -- nslookup target-service.target-namespace.svc.cluster.local

# Test specific port with netcat
kubectl exec source-pod -n source-namespace -- nc -zv target-pod-ip 80
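
To test several ports in one go, a small loop over nc keeps it quick (the port list here is just an example):

# Sweep a few ports with a 2-second timeout each
for port in 80 443 8080; do
  kubectl exec source-pod -n source-namespace -- nc -zv -w 2 target-pod-ip $port
done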

Step 5: Check for Default Deny Policies

Default Deny Patterns

# Look for policies with empty ingress/egress
kubectl get networkpolicy -A -o yaml | grep -A 20 "podSelector: {}"

# Common default-deny patterns:
# 1. Deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}  # Selects all pods
  policyTypes:
  - Ingress

# 2. Deny all egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress
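
The grep approach above is crude. If jq is available, this prints every policy with an empty podSelector (i.e. one that applies to all pods in its namespace) together with its policyTypes:

kubectl get networkpolicy -A -o json | jq -r \
  '.items[] | select(.spec.podSelector == {})
   | "\(.metadata.namespace)/\(.metadata.name)\t\(.spec.policyTypes)"'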

Testing NetworkPolicy Configurations

Test Pod Creation

Create Test Pods

# Create source pod with curl
kubectl run test-source --image=nicolaka/netshoot -n default -- sleep 3600

# Create target pod (nginx)
kubectl run test-target --image=nginx --labels="app=web" -n default

# Get target pod IP
TARGET_IP=$(kubectl get pod test-target -o jsonpath='{.status.podIP}')

# Test connection
kubectl exec test-source -- curl -v $TARGET_IP --max-time 5
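
To confirm that policies actually take effect on these test pods, save the default-deny-ingress manifest from Step 5 and re-run the same curl; it should now time out (the filename is a placeholder):

# Apply default-deny-ingress (manifest from Step 5), then re-test
kubectl apply -f default-deny-ingress.yaml
kubectl exec test-source -- curl -v $TARGET_IP --max-time 5  # should now time out

# Clean up when finished
kubectl delete pod test-source test-target
kubectl delete networkpolicy default-deny-ingress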

Test with netshoot Container

netshoot Has All Debugging Tools

# Run netshoot pod
kubectl run netshoot --rm -it --image=nicolaka/netshoot -- bash

# Inside netshoot:
# Test connectivity
curl -v target-service:80

# Check DNS
nslookup target-service.namespace.svc.cluster.local

# TCP port test
nc -zv target-ip 80

# UDP port test (10.96.0.10 is the typical kube-dns ClusterIP;
# confirm with kubectl get svc -n kube-system)
nc -zuv 10.96.0.10 53

# Trace route
traceroute target-ip

# Check HTTP response
curl -I http://target-service:80

Test DNS Resolution

Common DNS Issue: Egress to kube-dns Blocked

# Test DNS from pod
kubectl exec source-pod -- nslookup kubernetes.default

# If DNS fails, check egress to kube-system
# You need this egress rule:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress  # without this, the policy would also default-deny Ingress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

Common NetworkPolicy Problems and Fixes

Problem 1: Forgot to Allow DNS

Symptom

Pods can't resolve domain names after applying egress policy.

Fix

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  # Allow DNS
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow other egress
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
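
After applying the policy, verify that resolution works again from one of the selected pods (the pod name is a placeholder; the image must ship nslookup, which netshoot does):

# Should now resolve instead of timing out
kubectl exec myapp-pod -- nslookup kubernetes.default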

Problem 2: Cross-Namespace Communication Fails

Symptom

Pod in namespace-a can't reach pod in namespace-b.

Fix

# Ingress policy on destination pod (namespace-b)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-namespace-a
  namespace: namespace-b
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: namespace-a
      podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080

Note on namespace labels: since Kubernetes v1.21, every namespace automatically gets the immutable kubernetes.io/metadata.name label, so it needs no manual setup. If you match on a custom label instead, make sure it actually exists:

# kubernetes.io/metadata.name is set automatically and cannot be changed;
# custom labels must be added yourself (team=a is an example)
kubectl label namespace namespace-a team=a

Problem 3: Forgot Both Ingress AND Egress

Symptom

Even though ingress is allowed, connection still times out.

Root Cause

You allowed INGRESS on the destination, but forgot EGRESS on the source pod.

Fix

# Source pod needs egress rule
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-egress
  namespace: frontend-ns
spec:
  podSelector:
    matchLabels:
      app: frontend
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: backend-ns
      podSelector:
        matchLabels:
          app: backend
    ports:
    - protocol: TCP
      port: 8080

---
# Destination pod needs ingress rule
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-ingress
  namespace: backend-ns
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: frontend-ns
      podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
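
With both policies applied, verify end to end from an actual frontend pod. The deployment and service names below are assumptions:

# Expect an HTTP status code rather than a timeout
kubectl exec -n frontend-ns deploy/frontend -- \
  curl -s -o /dev/null -w '%{http_code}\n' \
  http://backend.backend-ns.svc.cluster.local:8080 --max-time 5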

Problem 4: Wrong Label Selector

Symptom

Policy exists but doesn't seem to apply.

Root Cause

Pod labels don't match the podSelector in NetworkPolicy.

Debug

# Check pod labels
kubectl get pod target-pod -o jsonpath='{.metadata.labels}' | jq

# Check NetworkPolicy selector
kubectl get networkpolicy my-policy -o yaml | grep -A 5 podSelector

# Common mistakes:
# Policy: app=web        Pod: app=webserver  (no match)
# Policy: tier=frontend  Pod: role=frontend  (wrong key)
# Policy: app=api        Pod: app=API        (case mismatch)
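
To check the match programmatically, feed the policy's matchLabels back into a pod query; empty output means the policy selects zero pods (requires jq):

# Build a label-selector string from the policy's matchLabels
SELECTOR=$(kubectl get networkpolicy my-policy -o json \
  | jq -r '.spec.podSelector.matchLabels | to_entries
           | map("\(.key)=\(.value)") | join(",")')

# No output here means the policy matches nothing
kubectl get pods -l "$SELECTOR"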

Problem 5: Allowing External Traffic (Internet)

Symptom

Pod can't reach external APIs or websites.

Fix

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-external
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  # Allow all external traffic
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 169.254.169.254/32  # Block AWS metadata
    ports:
    - protocol: TCP
      port: 443  # HTTPS
    - protocol: TCP
      port: 80   # HTTP
  # Allow DNS
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
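
After applying, verify DNS and outbound HTTPS from one of the selected pods (the pod name is a placeholder):

# If nslookup fails but curl to a raw IP works, revisit the DNS egress rule
kubectl exec myapp-pod -- nslookup example.com
kubectl exec myapp-pod -- curl -sI https://example.com --max-time 5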

Essential Debugging Tools

1. kubectl exec with curl

# Basic connectivity test
kubectl exec source-pod -- curl -v http://target-service:80 --max-time 5

# Test specific IP and port
kubectl exec source-pod -- curl -v http://10.244.1.5:8080

# Silent test (just check exit code)
kubectl exec source-pod -- curl -f -s http://target-service:80 && echo "SUCCESS" || echo "FAIL"

2. netshoot Container

netshoot has all networking tools pre-installed (curl, dig, nc, tcpdump, etc.)

# Run as standalone pod
kubectl run netshoot --rm -it --image=nicolaka/netshoot -- bash

# Attach an ephemeral netshoot debug container to an existing pod
kubectl debug -it existing-pod --image=nicolaka/netshoot --target=container-name

3. kubectl exec with nc (netcat)

# TCP port test
kubectl exec source-pod -- nc -zv target-pod-ip 80

# UDP port test (DNS)
kubectl exec source-pod -- nc -zuv kube-dns.kube-system 53

# Listen on port (for reverse testing)
kubectl exec target-pod -- nc -l -p 8080

4. Cilium CLI (if using Cilium CNI)

# Install Cilium CLI
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin

# Check connectivity
cilium connectivity test

# Monitor network traffic
cilium monitor --related-to default/pod-name

5. Network Policy Advisor (Inspektor Gadget)

# Deploy Inspektor Gadget (requires the kubectl-gadget plugin, installable via krew)
kubectl gadget deploy

# Record traffic and generate suggested policies (subcommand syntax varies by version)
kubectl gadget advise network-policy monitor -n default --output ./network.json
kubectl gadget advise network-policy report --input ./network.json

NetworkPolicy Debugging Flowchart

  1. Connection times out? Check whether a NetworkPolicy is blocking it
  2. Connection refused? NOT a NetworkPolicy issue - check application
  3. DNS fails? Allow egress to kube-system port 53
  4. Cross-namespace fails? Check namespace labels and both ingress/egress
  5. Policy not working? Verify CNI supports NetworkPolicy
  6. Still stuck? Use a netshoot pod + tcpdump to capture packets (see the sketch below)
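
For step 6, a packet capture shows the classic NetworkPolicy drop signature: SYN retransmits on the source side and nothing arriving at the destination. A sketch using an ephemeral debug container (the target container name app is an assumption):

# Capture on the destination pod; with a policy drop, expect silence here
# while the source shows repeated SYNs
kubectl debug -it target-pod --image=nicolaka/netshoot --target=app -- \
  tcpdump -i any -nn port 80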