advanced 36 min read aws Updated: 2025-06-25

AWS VPC Security Groups & Network Policies

Design and manage AWS VPC security groups, NACLs, and network segmentation policies for secure and compliant cloud networking.

📋 Prerequisites

Expert knowledge of VPC concepts (subnets, route tables, internet gateways, NAT gateways).
Advanced proficiency with IAM and networking concepts (CIDR notation, ports, protocols).
Strong experience with Terraform for deploying network infrastructure.
Familiarity with AWS Organizations and AWS Firewall Manager.

💡 Network Security: A Layered Defense

Effective network security in AWS is not about a single firewall; it's about a layered defense-in-depth strategy. **Security Groups** act as stateful firewalls at the instance level, while **Network ACLs (NACLs)** provide a stateless guardrail at the subnet level. Mastering the interplay between these two, along with centralized management and endpoint policies, is the key to building a secure and resilient network architecture.

What You'll Learn

Security Groups vs. NACLs: A Deep Dive
Advanced Security Group Design Patterns
Implementing a Defense-in-Depth NACL Strategy
Centralized Management with AWS Firewall Manager
Troubleshooting Common Network Connectivity Issues

🏷️ Topics Covered

aws vpc security group policies
aws network security best practices
aws vpc network segmentation
aws security group automation
aws nacl policy examples
aws vpc flow logs analysis
aws network compliance monitoring
aws vpc security architecture

Security Groups vs. NACLs: A Deep Dive

While both control traffic, their behavior and position in the network stack are fundamentally different. Understanding this is critical for proper design and troubleshooting.

🛡️ Security Groups (Stateful)

Scope: Acts at the ENI/instance level.
Rules: Supports "allow" rules only. Everything else is implicitly denied.
State: Stateful. Return traffic for an allowed connection is permitted automatically: if you allow inbound traffic, the outbound response is allowed too, and vice versa.
Use Case: Primary firewall for your instances, allowing application-specific traffic.

🧱 Network ACLs (Stateless)

Scope: Acts at the subnet level.
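Rules: Supports both "allow" and "deny" rules, evaluated in ascending rule-number order; the first matching rule wins.
State: Stateless. Return traffic is not allowed automatically; you must add an explicit rule for it, typically covering the ephemeral port range.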
Use Case: A broad, blunt guardrail to block unwanted traffic from ever reaching your subnet.
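
To make the stateless behavior concrete, here is a minimal Terraform sketch of what it takes to admit HTTPS through a NACL: one inbound rule for the request and a separate outbound rule for the response. The resource names (`example`, `https_in`, `ephemeral_out`) are hypothetical, `aws_vpc.main` is assumed to be defined elsewhere, and the NACL is deliberately left unassociated with any subnet.

```hcl
# Hypothetical NACL for illustration; aws_vpc.main is assumed to exist.
resource "aws_network_acl" "example" {
  vpc_id = aws_vpc.main.id
}

# Inbound: admit HTTPS requests arriving at the subnet.
resource "aws_network_acl_rule" "https_in" {
  network_acl_id = aws_network_acl.example.id
  rule_number    = 100
  egress         = false
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "0.0.0.0/0"
  from_port      = 443
  to_port        = 443
}

# Outbound: responses leave toward the client's ephemeral port, so the
# stateless NACL needs its own rule for them.
resource "aws_network_acl_rule" "ephemeral_out" {
  network_acl_id = aws_network_acl.example.id
  rule_number    = 100
  egress         = true
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "0.0.0.0/0"
  from_port      = 1024
  to_port        = 65535
}
```

A Security Group would need only the equivalent of the first rule, because it tracks the connection and lets the response through on its own.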

Advanced Security Group Design Patterns

A key feature of Security Groups is their ability to reference other Security Groups as a source or destination. This allows you to create dynamic, tightly-scoped rules for multi-tier applications without hardcoding IP addresses.

Pattern: Three-Tier Web Application

In this classic architecture, we create three distinct security groups—one for each layer—that only allow traffic from the preceding layer.

🏗️ HCL: Multi-Tier Security Groups in Terraform

This Terraform code defines a secure, three-tier network model where only the Web layer is accessible from the internet, and the Database layer is only accessible from the App layer.

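# 1. Web Tier Security Group (public facing)
resource "aws_security_group" "web_sg" {
  name   = "web-tier-sg"
  vpc_id = aws_vpc.main.id

  # One rule per port: a single 80-443 range would open every port in between.
  ingress {
    description = "Allow HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "Allow HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Terraform removes the default allow-all egress rule; declare outbound access explicitly.
}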

# 2. App Tier Security Group (private)
resource "aws_security_group" "app_sg" {
  name   = "app-tier-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description     = "Allow traffic from the Web Tier"
    from_port       = 8080 # Application port
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.web_sg.id] # <-- Reference to Web SG
  }
}

# 3. DB Tier Security Group (private)
resource "aws_security_group" "db_sg" {
  name   = "db-tier-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description     = "Allow traffic from the App Tier"
    from_port       = 5432 # PostgreSQL port
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app_sg.id] # <-- Reference to App SG
  }
}
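
The groups above declare only ingress rules. Because Terraform removes the default allow-all egress rule from security groups it creates, each tier also needs an outbound rule before it can reach the next one. The following is a minimal sketch of those rules, assuming a recent AWS provider version that includes `aws_vpc_security_group_egress_rule`; the resource names `web_to_app` and `app_to_db` are hypothetical. Standalone rule resources are used so that two groups can reference each other without creating a dependency cycle.

```hcl
# Web tier may initiate connections only to the App tier on 8080.
resource "aws_vpc_security_group_egress_rule" "web_to_app" {
  security_group_id            = aws_security_group.web_sg.id
  description                  = "Web tier to App tier"
  ip_protocol                  = "tcp"
  from_port                    = 8080
  to_port                      = 8080
  referenced_security_group_id = aws_security_group.app_sg.id
}

# App tier may initiate connections only to the DB tier on 5432.
resource "aws_vpc_security_group_egress_rule" "app_to_db" {
  security_group_id            = aws_security_group.app_sg.id
  description                  = "App tier to DB tier"
  ip_protocol                  = "tcp"
  from_port                    = 5432
  to_port                      = 5432
  referenced_security_group_id = aws_security_group.db_sg.id
}
```

The AWS provider documentation cautions against mixing standalone rule resources with inline `ingress`/`egress` blocks on the same group, so in a real configuration you would likely move the ingress rules to `aws_vpc_security_group_ingress_rule` resources as well.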

Implementing a Defense-in-Depth NACL Strategy

NACLs are your first line of defense. While you should generally keep them open (as Security Groups provide finer control), they are extremely effective for blocking known bad actors or denying entire protocols at the subnet boundary.

🏗️ HCL: NACL for Blocking Bad IPs & Unwanted Protocols

This NACL denies all traffic from a known malicious IP address and blocks all inbound UDP traffic at the subnet boundary, regardless of Security Group rules.

resource "aws_network_acl" "main" {
  vpc_id     = aws_vpc.main.id
  subnet_ids = [aws_subnet.private.id]

  # INBOUND RULES (evaluated in ascending rule_no order; first match wins)
  ingress {
    rule_no    = 100
    protocol   = "-1" # All protocols; requires from_port and to_port of 0
    action     = "deny"
    cidr_block = "203.0.113.5/32" # Known malicious IP
    from_port  = 0
    to_port    = 0
  }

  ingress {
    rule_no    = 200
    protocol   = "udp"
    action     = "deny"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 65535 # Cover the full UDP port range
  }

  # A custom NACL ends in an implicit deny, so allow the remaining traffic
  # after the denies. Allow-all is shown here; tighten it as needed.
  ingress {
    rule_no    = 300
    protocol   = "-1"
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  # OUTBOUND RULES (must allow return traffic on ephemeral ports)
  egress {
    rule_no    = 100
    protocol   = "tcp"
    action     = "allow"
    cidr_block = "0.0.0.0/0"
    from_port  = 1024
    to_port    = 65535
  }
}

Centralized Management with AWS Firewall Manager

Manually ensuring every new account and VPC has the correct baseline security groups is not scalable. **AWS Firewall Manager** is a security management service that allows you to centrally configure and deploy firewall rules across multiple accounts and resources in your AWS Organization.

🛡️ Firewall Manager Policy Types

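Firewall Manager offers several security group policy types. For baselines, the most relevant is the content audit policy: you create a **Firewall Manager security group policy** that defines a set of baseline rules (e.g., "all security groups must block inbound SSH from the internet"). Firewall Manager then continuously audits the in-scope security groups in your organization and reports any that are non-compliant. With automatic remediation enabled, it can also correct them.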
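
As a rough sketch of how such a policy might look in Terraform, the example below pairs an audit security group with an `aws_fms_policy` of type `SECURITY_GROUPS_CONTENT_AUDIT`. It assumes a Firewall Manager administrator account is already designated, that AWS Config is enabled in member accounts, and that `aws_vpc.main` lives in the administrator account. The resource names are hypothetical, and the `managed_service_data` JSON follows the shape documented for the Firewall Manager policy API, so verify it against the current documentation before relying on it.

```hcl
# Hypothetical audit group: with a DENY action, its rules describe what
# in-scope security groups must NOT contain.
resource "aws_security_group" "audit_no_public_ssh" {
  name        = "fms-audit-no-public-ssh"
  description = "Audit group for Firewall Manager; not attached to any resource"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "Pattern to flag: SSH open to the internet"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_fms_policy" "no_public_ssh" {
  name                  = "no-public-ssh"
  exclude_resource_tags = false
  remediation_enabled   = false # Report only; do not modify security groups
  resource_type         = "AWS::EC2::SecurityGroup"

  security_service_policy_data {
    type = "SECURITY_GROUPS_CONTENT_AUDIT"

    managed_service_data = jsonencode({
      type                = "SECURITY_GROUPS_CONTENT_AUDIT"
      securityGroups      = [{ id = aws_security_group.audit_no_public_ssh.id }]
      securityGroupAction = { type = "DENY" }
    })
  }
}
```

Starting with `remediation_enabled = false` lets you review the compliance findings before allowing Firewall Manager to change any rules.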

Troubleshooting Common Network Connectivity Issues

Network access issues are common and can be complex to debug. Here's a systematic approach to solving them.

⏱️ Connection Timeout Error

Symptom: An application on Instance A cannot connect to a database on Instance B and the request times out.
Root Cause Checklist:
1. Security Group (Outbound): Does the Security Group for Instance A have an outbound rule allowing traffic to the database port (e.g., 5432) and destination (the IP or Security Group of Instance B)?
2. Security Group (Inbound): Does the Security Group for Instance B have an inbound rule allowing traffic on the database port from the source (the IP or Security Group of Instance A)?
3. Network ACLs: Check the NACLs on both the source and destination subnets. Is there an inbound rule on the destination subnet's NACL that allows the traffic? Is there an outbound rule on the source subnet's NACL that allows it? *Crucially, is there an outbound rule on the destination NACL and an inbound rule on the source NACL to allow the return traffic?*
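
One way to automate this checklist is VPC Reachability Analyzer, which evaluates security groups, NACLs, and route tables along a path. The sketch below drives it from Terraform using the AWS provider's network insights resources. `aws_instance.app` and `aws_instance.db` are hypothetical stand-ins for Instance A and Instance B, and port 5432 matches the example above.

```hcl
# Hypothetical instances: aws_instance.app is Instance A, aws_instance.db is Instance B.
resource "aws_ec2_network_insights_path" "app_to_db" {
  source           = aws_instance.app.id
  destination      = aws_instance.db.id
  protocol         = "tcp"
  destination_port = 5432
}

# Runs the analysis for the path defined above.
resource "aws_ec2_network_insights_analysis" "app_to_db" {
  network_insights_path_id = aws_ec2_network_insights_path.app_to_db.id
}

# True when the analyzer finds an open network path from A to B.
output "app_to_db_reachable" {
  value = aws_ec2_network_insights_analysis.app_to_db.path_found
}
```

If the output is `false`, the analysis results in the console identify the component that blocks the path, which narrows the checklist to a single security group or NACL.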

🚫 "Connection Refused" Error

Symptom: An application gets an immediate "connection refused" response, not a timeout.
Cause: This typically means the network path is open (Security Groups and NACLs are likely correct), but there is no process listening on the destination port on the target instance.
Solution: Verify that the application (e.g., the database server) is running on the target instance and is configured to listen on the correct port and network interface.

🔑 Expert-Level VPC Security Best Practices

Reference Security Groups, Not IPs: For internal traffic between tiers, always reference other security groups as the source. This is more secure and dynamically adapts as instances are added or removed.
Keep NACLs Simple: Use NACLs as a broad shield. For most subnets, the default "ALLOW ALL" is fine. Only use custom NACLs to block specific, known-bad traffic at the edge.
Use Firewall Manager for Baselines: Define your mandatory organization-wide rules (e.g., no public SSH) in a Firewall Manager policy and apply it to all accounts to ensure a consistent baseline.
Automate Everything with IaC: All VPCs, subnets, route tables, security groups, and NACLs should be defined as code in Terraform for auditability and repeatability.

You've Secured Your Network Foundation!

A secure network is the backbone of your AWS environment.