Cloud Networking Fundamentals: How a VPC Actually Works
Networking is the wall most beginners hit. This is the from-scratch mental model (VPCs, subnets, route tables, gateways, and security groups), explained clearly enough that you could build one today. It comes with a diagram, real Terraform, and the mistakes that cost people hours.
You can spin up a server in the cloud in two minutes. Then you try to reach it from the internet, and nothing works. Or your app can't talk to its database. Or it works, but a security review later finds your database was open to the entire planet. Every one of these is a networking problem, and networking is where most people's cloud confidence quietly falls apart.
The good news: the cloud network model is built from about six concepts. Once they click, every provider (AWS, Azure, GCP) looks the same with different names. This article builds that model from zero. By the end you'll understand what a VPC is, draw one from memory, and provision a real one with Terraform.
Who this is for
Total beginners welcome. If you know what an IP address is, you have enough to follow along. We use AWS names because they're the most common, but the concepts transfer directly to Azure VNets and GCP VPCs.
The one-sentence definition of a VPC
A VPC (Virtual Private Cloud) is your own private, isolated network inside the cloud: a building you control, where you decide every room, every door, and who's allowed through each one.
When you create an AWS account, nothing you launch is floating loose on the public internet. It lives inside a VPC. The VPC is the outer boundary. Inside it, you carve up the space and control movement through it. The whole thing maps cleanly onto a building you already understand:
The building: VPC
Rooms inside it: Subnets
Hallway signs: Route tables
Doors to outside: Gateways
Door guards: Security groups
Hold this picture in your head: every term below is just one of these five things.
That's the whole mental model: boundary → rooms → hallways → doors → guards. Let's see it as a picture, then take each piece apart.
The anatomy of a VPC
A standard single-VPC layout. Inbound traffic flows left-to-right through the internet gateway and load balancer into private app servers. The database is never reachable from the internet. App servers reach OUT (for updates, APIs) through the NAT gateway (dashed), but nothing reaches in.
That picture looks busy at first, so let's walk a single request through it, one hop at a time:
1. A user opens your site. Their request leaves the public internet and arrives at the Internet Gateway, the one and only public door into your VPC.
2. The gateway hands off to the load balancer. The load balancer sits in a public subnet. It's the only thing in your whole setup exposed to the world.
3. The load balancer forwards to an app server. App servers live in a private subnet with no public address. The internet cannot reach them directly, only the load balancer can.
4. The app queries the database. The database is in its own private subnet and accepts connections only from the app tier. It is never reachable from outside.
5. When the app needs the internet, it exits via NAT. To download a package or call an external API, the app's traffic goes out through the NAT Gateway, which lets connections OUT but never lets them IN.
Notice the asymmetry, because it's the whole point of cloud networking: inbound and outbound are controlled separately. The database accepts connections only from the app. The app accepts connections only from the load balancer. The load balancer is the single thing exposed to the world. That layering is what 'secure by design' actually looks like.
Subnets: carving the VPC into rooms
A VPC has an address range, written in CIDR notation, for example 10.0.0.0/16. Don't let CIDR intimidate you; the only number that matters is the one after the slash. It says how many of the 32 address bits are fixed, so the smaller it is, the more addresses you have: a /16 leaves 16 free bits (2^16 = 65,536 addresses), and a /24 leaves 8 (2^8 = 256). AWS reserves five addresses in every subnet, so a /24 subnet has 251 usable ones. The VPC owns a big range, and each subnet takes a slice of it.
addressing-plan.txt
text
VPC        10.0.0.0/16   (65,536 addresses, the whole building)
├─ public  10.0.1.0/24   (256 addresses, load balancer, NAT)
├─ app     10.0.10.0/24  (256 addresses, private app servers)
└─ data    10.0.20.0/24  (256 addresses, private database)
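The same slicing can be computed rather than typed by hand. A minimal sketch, assuming Terraform's built-in `cidrsubnet` function; the local names (`vpc_cidr`, `subnets`) and the output name are illustrative and not part of the original setup:

```hcl
locals {
  # The whole building: same range as the addressing plan above
  vpc_cidr = "10.0.0.0/16"

  # cidrsubnet(prefix, newbits, netnum): adding 8 bits to a /16 yields /24
  # slices, and netnum picks which slice (here it becomes the third octet).
  subnets = {
    public = cidrsubnet(local.vpc_cidr, 8, 1)  # "10.0.1.0/24"
    app    = cidrsubnet(local.vpc_cidr, 8, 10) # "10.0.10.0/24"
    data   = cidrsubnet(local.vpc_cidr, 8, 20) # "10.0.20.0/24"
  }
}

output "subnet_plan" {
  value = local.subnets
}
```

Deriving every subnet from one base range keeps the slices from overlapping by accident, and changing the VPC range later means editing a single line.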
The single most important distinction in cloud networking is public subnet vs private subnet, and here's the secret: there is no checkbox that says "public." A subnet is public *only because* its route table sends internet-bound traffic to an internet gateway. Change that one route and the same subnet becomes private. The subnet doesn't decide; the routing does.
Pro tip
Rule of thumb: put anything that must accept traffic from the internet (load balancers, bastion hosts) in a public subnet. Put everything else (app servers, databases, caches) in private subnets. Default to private. You can always add a door later; you can't un-leak a database.
Route tables: the GPS of your network
A route table is a list of rules that answers one question for every packet: "this traffic is headed to address X, where do I send it?" Each subnet is associated with exactly one route table. Here's what makes a subnet public:
public-subnet-routes
text
Destination    Target         Meaning
10.0.0.0/16    local          stay inside the VPC
0.0.0.0/0      igw-0abc123    everything else → internet gateway
That second line, 0.0.0.0/0 (meaning "any address anywhere") pointing at the internet gateway, is the entire difference between public and private. A private subnet's route table sends 0.0.0.0/0 to a NAT gateway instead (outbound only), or has no internet route at all.
Internet Gateway vs NAT Gateway
These two get confused constantly. The fastest way to keep them straight: an Internet Gateway is a two-way door, a NAT Gateway is a one-way valve.
| | Internet Gateway | NAT Gateway |
|---|---|---|
| Direction | Two-way (in AND out) | One-way (out only) |
| Used by | Public subnets | Private subnets |
| Who can start a connection | Internet → you, and you → internet | Only you → internet |
| Cost | Free | ~$32/mo each + data charges |
| Typical resident | Load balancer, bastion host | App servers that need updates/APIs |

Both connect to the internet, but in opposite directions, for opposite reasons.
This line item surprises people
NAT gateways bill ~$32/month each just to exist, before data charges. Teams often run one per availability zone for resilience and are then shocked by a $100+/month networking bill on an otherwise tiny setup. For dev environments, a single NAT gateway (or none, if nothing needs egress) is plenty.
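For a private subnet that does need egress, the wiring might look like the sketch below. It assumes the `aws_vpc.main`, `aws_subnet.public`, `aws_subnet.private`, and `aws_internet_gateway.igw` resources defined in main.tf later in this article, and AWS provider v5 or newer (older versions spell the Elastic IP argument `vpc = true` instead of `domain = "vpc"`). The resource names `nat` and `private` and the tag values are illustrative.

```hcl
# A static public address for the NAT gateway to use
resource "aws_eip" "nat" {
  domain = "vpc"
  tags   = { Name = "learn-nat-eip" }
}

# The one-way valve. It sits in the PUBLIC subnet, because it needs
# the internet gateway to reach the outside world.
resource "aws_nat_gateway" "nat" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id
  tags          = { Name = "learn-nat" }

  # Make sure the internet gateway exists first
  depends_on = [aws_internet_gateway.igw]
}

# Private route table: "everywhere" goes to the NAT gateway, not the IGW
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat.id
  }
  tags = { Name = "private-rt" }
}

resource "aws_route_table_association" "private" {
  subnet_id      = aws_subnet.private.id
  route_table_id = aws_route_table.private.id
}
```

Applying this starts the hourly NAT charge described above, so in a learning environment it is worth running `terraform destroy` when you are done.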
Security Groups vs NACLs: the two firewalls
Routing decides where traffic *can* go. Firewalls decide what's *allowed*. The cloud gives you two layers, and knowing which to reach for is a classic interview question. Here's the contrast at a glance:
| | Security Group | Network ACL |
|---|---|---|
| Wraps | A resource (server, database) | A whole subnet |
| State | Stateful, replies auto-allowed | Stateless, write both directions |
| Rules | Allow-only | Allow and deny, in number order |
| Can reference other groups | Yes (e.g. "allow from app SG") | No, IP ranges only |
| Reach for it when | Almost always | Coarse subnet-wide deny |

The 90% rule: use security groups for almost everything; reach for NACLs only for broad, subnet-wide blocks.
Security Groups: the guard at each instance
A security group wraps a resource (a server, a database) and controls its traffic. Two things make them beginner-friendly: they're stateful (if you allow a request in, the response is automatically allowed back out, so you don't write return rules), and they're allow-only (you list what's permitted; everything else is denied by default).
The elegant part is that security groups can reference *each other*. Instead of "allow port 5432 from 10.0.10.0/24," you say "allow port 5432 from the app security group." Now any server in the app tier can reach the database, no matter its IP, and nothing else can.
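In Terraform, that group-to-group reference might look like the sketch below. It assumes the `aws_vpc.main` resource defined in main.tf later in this article; the group names `app` and `db` are illustrative.

```hcl
# Guard for the app tier. Its own ingress rules (for example, from the
# load balancer) are omitted to keep the sketch short.
resource "aws_security_group" "app" {
  name   = "learn-app-sg"
  vpc_id = aws_vpc.main.id
}

# Guard for the database: PostgreSQL only, and only from members of the app group
resource "aws_security_group" "db" {
  name   = "learn-db-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description     = "PostgreSQL from the app tier"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id] # a group, not an IP range
  }
}
```

Because the rule names a group rather than a CIDR range, app servers can be replaced or scaled without touching the database's rules.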
NACLs: the guard at the room's doorway
Network ACLs operate at the subnet level instead of the resource level. They're stateless (you must write both inbound and outbound rules), and their rules are evaluated in rule-number order, lowest first, stopping at the first match. Most teams leave NACLs at their default "allow all" and do all their real work with security groups. Reach for NACLs when you need a coarse, subnet-wide deny, like blocking a known-bad IP range across an entire tier.
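A coarse subnet-wide deny could be sketched as follows. It assumes the `aws_vpc.main` and `aws_subnet.private` resources defined in main.tf later in this article; `203.0.113.0/24` is a documentation-only range standing in for the "known-bad" addresses, and the resource name `app` is illustrative.

```hcl
resource "aws_network_acl" "app" {
  vpc_id     = aws_vpc.main.id
  subnet_ids = [aws_subnet.private.id]

  # Lower rule numbers are evaluated first, so this deny wins over the allow below
  ingress {
    rule_no    = 90
    action     = "deny"
    protocol   = "-1" # all protocols
    cidr_block = "203.0.113.0/24"
    from_port  = 0
    to_port    = 0
  }

  ingress {
    rule_no    = 100
    action     = "allow"
    protocol   = "-1"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  # Stateless: return traffic needs its own outbound rule
  egress {
    rule_no    = 100
    action     = "allow"
    protocol   = "-1"
    cidr_block = "0.0.0.0/0"
    from_port  = 0
    to_port    = 0
  }

  tags = { Name = "app-nacl" }
}
```

Unlike the default NACL, a custom one denies anything its rules do not match, which is why the sketch spells out the allow rules in both directions.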
Pro tip
The 90% answer: use security groups for almost everything. They're stateful, composable, and hard to misconfigure. Touch NACLs only when you specifically need a broad, subnet-level block that a security group can't express.
Let's build one with Terraform
Concepts stick when you provision them yourself. Here's a minimal-but-real VPC with one public and one private subnet. Read it top to bottom; every resource maps to something we just covered.
main.tf
hcl
# The building: a /16 VPC with ~65k addresses
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  tags                 = { Name = "learn-vpc" }
}

# The public door
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.main.id
  tags   = { Name = "learn-igw" }
}

# A public room (load balancer lives here)
resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "eu-west-1a"
  tags                    = { Name = "public-a" }
}

# A private room (app + db live here)
resource "aws_subnet" "private" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.10.0/24"
  availability_zone = "eu-west-1a"
  tags              = { Name = "private-a" }
}
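The listing above has no provider configuration, which Terraform needs before it can talk to AWS. A minimal sketch of one, assuming the AWS provider v5 line and the `eu-west-1` region implied by the `eu-west-1a` availability zone; the version constraint is an assumption, not part of the original listing:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; pin whatever you have tested
    }
  }
}

provider "aws" {
  region = "eu-west-1" # must contain the availability zone used by the subnets
}
```

Credentials are not set here on purpose; the provider picks them up from the standard AWS sources such as environment variables or the shared credentials file.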
So far we have a boundary, two rooms, and a door. But the public subnet isn't actually public yet. Remember, that's decided by routing. This is the part beginners forget, and then wonder why their server is unreachable:
routing.tf
hcl
# Route table that sends "everywhere" to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0" # any destination
    gateway_id = aws_internet_gateway.igw.id
  }
  tags = { Name = "public-rt" }
}

# THIS is the line that makes the subnet public
resource "aws_route_table_association" "public" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
And a security group that allows HTTPS in from the world but nothing else. Note that we never write an outbound rule for the response, because security groups are stateful:
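A minimal sketch of such a security group, assuming the `aws_vpc.main` resource from main.tf; the resource name `web`, the group name, and the tag values are illustrative:

```hcl
resource "aws_security_group" "web" {
  name        = "learn-web-sg"
  description = "Allow HTTPS in from anywhere"
  vpc_id      = aws_vpc.main.id

  # The only thing permitted in: TCP 443 from any address
  ingress {
    description = "HTTPS from the internet"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # No egress block: replies to the requests above still flow (stateful),
  # but the resource cannot open new outbound connections of its own.
  tags = { Name = "learn-web-sg" }
}
```

One Terraform-specific detail: AWS normally adds an allow-all outbound rule to a new security group, but the Terraform AWS provider removes it when the resource has no `egress` block. If the resource behind this group needs to initiate outbound connections, add an explicit `egress` block.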
Provision it and inspect what you built. Reading your network back from the CLI is a skill worth practising: it's how you'll debug every connectivity issue for the rest of your career.
verify.sh
bash
# Create everything
terraform init
terraform apply
# List your VPCs and their address ranges
aws ec2 describe-vpcs \
--query 'Vpcs[].{Id:VpcId,Cidr:CidrBlock,Name:Tags[?Key==`Name`]|[0].Value}' \
--output table
# Confirm which subnet routes to the internet gateway
aws ec2 describe-route-tables \
--filters "Name=vpc-id,Values=<your-vpc-id>" \
--query 'RouteTables[].Routes[?GatewayId!=`local`]'
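To avoid copying the VPC ID by hand, one option is to expose it as a Terraform output. A sketch, assuming the resources from main.tf; the output names are illustrative:

```hcl
output "vpc_id" {
  description = "ID of the VPC, for use in CLI filters"
  value       = aws_vpc.main.id
}

output "public_subnet_id" {
  description = "ID of the subnet associated with the public route table"
  value       = aws_subnet.public.id
}
```

After `terraform apply`, running `terraform output -raw vpc_id` prints the bare ID, which can stand in for `<your-vpc-id>` in the filter above.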
Pro tip
If a resource is unreachable, debug in this order every time: (1) Is it in a subnet whose route table points 0.0.0.0/0 at an internet gateway? (2) Does its security group allow the port inbound? (3) Does the NACL allow it? 90% of "it won't connect" issues are step 1 or step 2.
Common mistakes that cost people hours
Forgetting the route table association. You create a public subnet and an internet gateway but never wire them together. The subnet is silently private. Nothing is reachable and there's no error to tell you why.
Opening security groups to 0.0.0.0/0 on the wrong port. SSH (22) or database (5432, 3306) open to the whole internet is one of the most common findings in cloud security audits. Open those only to specific IPs or other security groups.
Overlapping CIDR ranges. If you ever want to connect two VPCs (peering, VPN), their address ranges must not overlap. Plan your 10.0.x.x blocks before you build, not after.
Running NAT gateways you don't need. If nothing in a private subnet needs to reach the internet, you don't need a NAT gateway burning money 24/7.
Putting databases in public subnets "just to make it work." It works, and then it's a breach. Databases belong in private subnets, reachable only from the app tier.
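One way to guard against the open-SSH mistake is to make the catch-all value impossible to pass in. A sketch, assuming `aws_vpc.main` from main.tf; the variable `admin_cidr` and the group name `bastion` are hypothetical:

```hcl
variable "admin_cidr" {
  description = "The only IP range allowed to SSH in, e.g. a single office address as a /32"
  type        = string

  validation {
    # Reject anything that is not valid CIDR, and reject the catch-all range
    condition     = can(cidrhost(var.admin_cidr, 0)) && var.admin_cidr != "0.0.0.0/0"
    error_message = "The admin_cidr value must be a valid CIDR range other than 0.0.0.0/0."
  }
}

resource "aws_security_group" "bastion" {
  name   = "learn-bastion-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    description = "SSH from the admin range only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_cidr]
  }
}
```

The check only blocks the literal catch-all range, so treat it as a guard rail against the most common slip rather than a complete policy.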
Where to go next
The whole article in 6 lines
A **VPC** is your private network in the cloud, the building everything lives in.
**Subnets** are rooms; a subnet is "public" only because its route table sends 0.0.0.0/0 to an internet gateway.
**Route tables** decide where traffic goes; **gateways** are the doors (IGW = two-way, NAT = out-only).
**Security groups** are stateful, per-resource, allow-only, and can reference each other. Use them for almost everything.
Default everything to **private**. Only the load balancer should face the internet.
You can't un-leak a database. Put data tiers in private subnets, reachable only from the app.
That foundation carries directly into load balancing, DNS, and multi-region design. To make it muscle memory rather than head-knowledge, do it hands-on:
Get your hands dirty in the browser: the Networking Lab and the Terraform Lab let you build and break this safely.
See where networking fits in the bigger picture: the Cloud Engineer path takes you from here through load balancers, DNS, and high availability.
Build one VPC by hand today. The concepts that felt abstract at the top of this article will feel obvious by tonight.