Asymmetric return routes with a NAT in a VPC?

Question

To better understand how AWS VPCs (and NATs in general) work, I was reading this question where the goal was to have the following:

# GOAL
172.31.0.0/16   local
A.B.C.D/32      nat-451b3be9
0.0.0.0/0       igw-b4ac67d0

This is what intrigued me:

Note further that the configuration you're attempting will allow outbound, but never permit inbound connections (initiated from outside) from the A.B.C.D address to anything on this subnet, because the return route is asymmetric through the NAT gateway.

The NAT Gateway is not designed to be created on any subnet for which it provides NAT services. The instances reach external resources via their subnet's route table (points to NAT-GW for instances without public IP, points to IGW for instances with public IP) and the NAT-GW reaches the Internet via its subnet route table (points to IGW).

If an instance is using its own public IP, it must route responses out via the IGW because that's where the inbound traffic is coming from, and it can't try to leave via NAT-GW because the peer on the outside would see the reply coming from the wrong source IP if the traffic got translated.

I'm trying to understand exactly why it would allow outbound but not inbound traffic. Here's what I'm thinking: say the EC2 instance has an Elastic IP and is in a subnet with a route table like the one above, and the NAT is in a separate subnet. Say A.B.C.D initiates a connection to the instance's Elastic IP. Wouldn't the connection enter the VPC, get sent through the NAT by the route table, reach the instance, and then go back out through the NAT? However, since the reply was sent out through the NAT, its source address got translated (as said above), and the peer would drop the packet since it didn't come from the IP of the EC2 instance. Is this the correct understanding? The connection will still reach the EC2 instance, but the response packets will never be received by A.B.C.D?

score 1 · Answer 1 · answered Aug 16 '19 at 11:06

The internet gateway logically provides address translation on behalf of your instance, so that when traffic leaves your VPC subnet and goes to the internet, the reply address field is set to the public IPv4 address or Elastic IP address of your instance, and not its private IP address. When traffic is entering the VPC, it translates the public IP address back to the instance's private address and sends the request to the VPC router, which routes the packet to the instance.

For the outbound traffic, however, the VPC router selects the most specific route and forwards the packets to the NAT GW, which does the address translation and sends the packets to the IGW (through the VPC router). The IGW then replaces the source address (the NAT GW's private IP) with the NAT GW's public IP. The peer will drop the packets because the source IP does not match the EC2 public IP it connected to.

If the connection is initiated by the peer:

Peer --> (EC2 public IP) --> IGW --> (EC2 private IP) --> EC2

If the connection is initiated by EC2:

EC2 --> (destination IP) --> NAT GW -1-> IGW -2-> Peer

In the second case address translation occurs twice:

1. EC2 private IP to NAT GW private IP (at the NAT GW).
2. NAT GW private IP to NAT GW public IP (at the IGW).
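To make the two translations concrete, here is a minimal sketch that traces a reply packet's source address along both possible exit paths. All addresses are hypothetical stand-ins (203.0.113.10 plays the role of A.B.C.D), and the two functions are toy models of the behavior described above, not AWS APIs.

```python
# All addresses below are hypothetical stand-ins chosen for illustration.
EC2_PRIVATE = "172.31.10.5"
EC2_PUBLIC = "198.51.100.5"    # the instance's Elastic IP
NAT_PRIVATE = "172.31.0.20"
NAT_PUBLIC = "198.51.100.20"   # the NAT gateway's Elastic IP
PEER = "203.0.113.10"          # stands in for A.B.C.D

# One-to-one mappings the internet gateway applies to source addresses.
IGW_PUBLIC_FOR = {EC2_PRIVATE: EC2_PUBLIC, NAT_PRIVATE: NAT_PUBLIC}


def nat_gateway(packet):
    """Translation 1: rewrite the source to the NAT gateway's private IP."""
    return {**packet, "src": NAT_PRIVATE}


def internet_gateway(packet):
    """Translation 2: rewrite a private source to its associated public IP."""
    return {**packet, "src": IGW_PUBLIC_FOR[packet["src"]]}


reply = {"src": EC2_PRIVATE, "dst": PEER}

via_nat = internet_gateway(nat_gateway(reply))  # path chosen by the /32 route
direct = internet_gateway(reply)                # path the peer expects

print(via_nat["src"])  # the NAT gateway's public IP
print(direct["src"])   # the instance's own public IP
```

The peer opened its connection to the instance's public IP, so only the `direct` packet carries a source address it recognizes.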

score 1 · Accepted Answer · answered Aug 16 '19 at 18:27

Say A.B.C.D initiated a connection to the EC2's elastic IP. Wouldn't the connection enter the VPC, the routing table would send it through the NAT...

No, for two reasons:

1. VPC route tables are only concerned with destination addresses in packets -- not source addresses.
2. Inbound traffic doesn't actually consult a VPC route table. VPC route tables are attached to subnets, and are used to select a route for packets sourced from machines in the subnets attached to the route table, by looking up the packet's destination address in the table to find the target gateway.

When the traffic enters the VPC from outside, the Internet Gateway would translate the packet's destination address from the instance's EIP to the instance's private IP, and no route table lookup is used -- the IGW sends the traffic directly to the instance.

Then the instance would reply, and the destination address of the external machine would -- via a VPC route table lookup -- result in sending the replies toward the NAT device, resulting in the asymmetry described.

To state it another way, the only inbound packets that get sent through a NAT Gateway are packets whose destination address is the EIP of the NAT Gateway.
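The destination-only lookup described above can be sketched as a longest-prefix match over the route table from the question. This is an illustrative model rather than how the VPC router is actually implemented; `select_target` is a hypothetical helper, and 203.0.113.10 is a stand-in for A.B.C.D.

```python
import ipaddress

# Hypothetical stand-in for A.B.C.D (a documentation-range address).
PEER = "203.0.113.10"

# The subnet route table from the question: (destination CIDR, target).
ROUTES = [
    ("172.31.0.0/16", "local"),
    (PEER + "/32", "nat-451b3be9"),
    ("0.0.0.0/0", "igw-b4ac67d0"),
]


def select_target(dst_ip, routes=ROUTES):
    """Return the target of the most specific route that contains dst_ip."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dst in ipaddress.ip_network(cidr)
    ]
    # Longest prefix wins; the packet's source address is never consulted.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(select_target(PEER))            # reply to the peer: the /32 route wins
print(select_target("198.51.100.7"))  # any other internet host: default route
print(select_target("172.31.5.20"))   # another host inside the VPC: local
```

Because the function takes only a destination, there is no way for it to notice that a packet is a reply to a connection that arrived through the internet gateway, which is exactly where the asymmetry comes from.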

Ah, that makes sense. Thanks for the response. So to make sure I’m understanding correctly, if A.B.C.D sends a SYN to the EIP of the EC2 instance, it won’t pass through the NAT on the way inbound to the instance, but the SYN-ACK packet sent back out will pass through the NAT, and thus the peer will drop it as it shows as coming from the source IP of the NAT rather than the EC2 EIP? — rb612, Aug 16 '19 at 18:45
Correct, with one possible variation: it will *try* to pass through the NAT Gateway on the way out, but instead of translating it, the NAT Gateway might simply discard the packet, since it never saw the initial SYN and the packet corresponds to no established flow known to the NAT Gateway. Either way, the three-way handshake would not complete. — Michael - sqlbot, Aug 16 '19 at 18:54
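The "discard" variation can be illustrated with a toy stateful NAT that only translates packets belonging to a flow it saw being opened from the inside. This is a simplified sketch of the idea, not a description of the NAT Gateway's internals: the `ToyNat` class and all addresses are hypothetical, and a real NAT device would typically also rewrite source ports.

```python
class ToyNat:
    """Toy stateful NAT: translates only flows it saw start with a SYN."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.flows = set()  # known (src, dst) endpoint pairs

    def outbound(self, src, dst, flags):
        """Return the translated packet, or None if the packet is discarded."""
        flow = (src, dst)
        if flags == "SYN":
            self.flows.add(flow)  # a new connection initiated from inside
        if flow not in self.flows:
            return None  # matches no established flow: discard
        # Rewrite the source IP; the port is kept as-is for simplicity.
        return {"src": (self.public_ip, src[1]), "dst": dst, "flags": flags}


nat = ToyNat("198.51.100.20")    # hypothetical NAT public IP
instance = ("172.31.10.5", 443)  # hypothetical instance private IP and port
peer = ("203.0.113.10", 50000)   # stands in for A.B.C.D

# The peer's SYN arrived via the internet gateway, so the NAT never saw it.
stray = nat.outbound(instance, peer, "SYN-ACK")

# A connection the instance opens itself is tracked and translated.
opened = nat.outbound(instance, peer, "SYN")

print(stray)
print(opened)
```

In both variations (discard, or translate to the wrong source address) the peer never receives a SYN-ACK it can match to its SYN, so the handshake fails.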

score 0 · Answer 3 · answered Apr 18 '20 at 15:37

I had the exact same problem. It can be prevented using a load balancer which is not using the same route table as your application servers.

For example, if you want your application servers to route requests to external services through a NAT gateway, introduce a new route table that is only used by the app servers.

# app server route-table
172.31.0.0/16   local
A.B.C.D/32      nat-451b3be9
0.0.0.0/0       igw-b4ac67d0

Outgoing requests from your app servers will now be routed through the NAT gateway for IP A.B.C.D.

Incoming requests to your app servers should be routed through a load balancer. The load balancer (and probably the rest of your infrastructure) should use the default route table that doesn't reference the NAT gateway:

# default route-table
172.31.0.0/16   local
0.0.0.0/0       igw-b4ac67d0

Any incoming request will then be routed as follows: remote host -> load balancer -> app server -> load balancer -> remote host. The app server replies to the load balancer's IP address, which is local to the VPC, so the reply never goes through the NAT gateway.
