Environment: Amazon EC2 m4.4xlarge, running Amazon Linux 2 AMI 2.0
I would like to use iptables to load balance HTTPS requests between a set of Elastic IPs that are assigned to multiple Amazon Elastic Network Interfaces (ENIs) attached to the same instance.
I have a load balancer working for a single network interface, using SNAT rules in the POSTROUTING chain that round-robin between all the IP addresses assigned to the instance. The nat table looks something like this:
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
SNAT tcp -- anywhere anywhere statistic mode nth every 5 tcp dpt:https to:xxx.xxx.xxx.xx5
SNAT tcp -- anywhere anywhere statistic mode nth every 4 tcp dpt:https to:xxx.xxx.xxx.xx4
SNAT tcp -- anywhere anywhere statistic mode nth every 3 tcp dpt:https to:xxx.xxx.xxx.xx3
SNAT tcp -- anywhere anywhere statistic mode nth every 2 tcp dpt:https to:xxx.xxx.xxx.xx2
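For readers who want to reproduce a listing like the one above, here is a minimal sketch of a generator. The function name `gen_snat_rules` and the addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical and not from my setup; the commands are only printed, not executed. Each `statistic` rule only sees what the earlier rules let through, so the divisor shrinks by one per rule. Since nat rules are consulted only for the first packet of a connection, every address, including the default one that has no rule, ends up with an equal share of new connections.

```shell
#!/bin/sh
# Print (do not run) SNAT rules that spread new HTTPS connections evenly
# over the given extra source addresses plus the default source address.
# Hypothetical helper for illustration. Usage: gen_snat_rules EXTRA_IP...
gen_snat_rules() {
    n=$(( $# + 1 ))       # extra addresses plus the default one
    for ip in "$@"; do
        echo "iptables -t nat -A POSTROUTING -p tcp --dport 443" \
             "-m statistic --mode nth --every $n --packet 0" \
             "-j SNAT --to-source $ip"
        n=$(( n - 1 ))    # the next rule sees one share fewer
    done
}

# Four extra addresses: prints rules with --every 5, 4, 3 and 2.
gen_snat_rules 192.0.2.5 192.0.2.4 192.0.2.3 192.0.2.2
```

To apply the rules, the output could be reviewed and then piped to `sh` as root.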
So far, my approach to balancing across IPs on multiple interfaces has been to mark packets in the OUTPUT chain of the mangle table, policy-route to a network interface based on the firewall mark, and then round-robin over the IPs on that interface. Here are the tables:
MANGLE:
Chain OUTPUT (policy ACCEPT 1518K packets, 63M bytes)
pkts bytes target prot opt in out source destination
21 6007 CONNMARK tcp -- any any anywhere anywhere tcp dpt:https CONNMARK restore
4 240 MARK tcp -- any any anywhere anywhere tcp dpt:https connmark match 0x1 MARK set 0x1
4 240 MARK tcp -- any any anywhere anywhere tcp dpt:https connmark match 0x2 MARK set 0x2
4 240 MARK tcp -- any any anywhere anywhere tcp dpt:https connmark match 0x3 MARK set 0x3
12 720 RETURN tcp -- any any anywhere anywhere tcp dpt:https connmark match ! 0x0
3 180 MARK tcp -- any any anywhere anywhere statistic mode nth every 3 tcp dpt:https MARK set 0x3
3 180 CONNMARK tcp -- any any anywhere anywhere statistic mode nth every 3 tcp dpt:https CONNMARK set 0x3
3 180 CONNMARK tcp -- any any anywhere anywhere statistic mode nth every 3 tcp dpt:https CONNMARK save
3 180 RETURN tcp -- any any anywhere anywhere statistic mode nth every 3 tcp dpt:https
3 180 MARK tcp -- any any anywhere anywhere statistic mode nth every 2 tcp dpt:https MARK set 0x2
3 180 CONNMARK tcp -- any any anywhere anywhere statistic mode nth every 2 tcp dpt:https CONNMARK set 0x2
3 180 CONNMARK tcp -- any any anywhere anywhere statistic mode nth every 2 tcp dpt:https CONNMARK save
3 180 RETURN tcp -- any any anywhere anywhere statistic mode nth every 2 tcp dpt:https
3 180 MARK tcp -- any any anywhere anywhere statistic mode nth every 1 tcp dpt:https MARK set 0x1
3 180 CONNMARK tcp -- any any anywhere anywhere statistic mode nth every 1 tcp dpt:https CONNMARK set 0x1
3 180 CONNMARK tcp -- any any anywhere anywhere statistic mode nth every 1 tcp dpt:https CONNMARK save
3 180 RETURN tcp -- any any anywhere anywhere statistic mode nth every 1 tcp dpt:https
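The marking logic in the chain above can be sketched as follows. This is a simplified reconstruction, not the exact commands I ran: the helper name `gen_mark_rules` is hypothetical, and the sketch drops the per-value `MARK set` rules and the `CONNMARK set` rules, on the assumption that `--restore-mark` and `--save-mark` already copy the mark between the connection and the packet. The commands are printed rather than executed.

```shell
#!/bin/sh
# Print (do not run) mangle OUTPUT rules that pin each HTTPS connection
# to one of N firewall marks. Hypothetical helper for illustration.
# Usage: gen_mark_rules N
gen_mark_rules() {
    n=$1
    base="iptables -t mangle -A OUTPUT -p tcp --dport 443"
    # Known connection: copy its connmark onto the packet and stop.
    echo "$base -j CONNMARK --restore-mark"
    echo "$base -m connmark ! --mark 0 -j RETURN"
    # New connection: hand out marks N, N-1, ..., 1 in turn.
    m=$n
    while [ "$m" -ge 1 ]; do
        nth="-m statistic --mode nth --every $m --packet 0"
        echo "$base $nth -j MARK --set-mark $m"
        echo "$base $nth -j CONNMARK --save-mark"
        echo "$base $nth -j RETURN"
        m=$(( m - 1 ))
    done
}

# Three marks, as in the listing above.
gen_mark_rules 3
```

Each group of three rules carries its own `statistic` counter, but because MARK and CONNMARK do not terminate chain traversal, all three see the same packets and stay in step.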
NAT:
Chain POSTROUTING (policy ACCEPT 44028 packets, 3373K bytes)
pkts bytes target prot opt in out source destination
1 60 SNAT tcp -- any eth0 anywhere anywhere statistic mode nth every 2 tcp dpt:https to:xxx.xxx.xxx.xxx2
1 60 SNAT tcp -- any eth0 anywhere anywhere statistic mode nth every 3 tcp dpt:https to:xxx.xxx.xxx.xxx3
1 60 SNAT tcp -- any eth0 anywhere anywhere statistic mode nth every 4 tcp dpt:https to:xxx.xxx.xxx.xxx4
0 0 SNAT tcp -- any eth0 anywhere anywhere statistic mode nth every 5 tcp dpt:https to:xxx.xxx.xxx.xxx5
0 0 SNAT tcp -- any eth0 anywhere anywhere statistic mode nth every 6 tcp dpt:https to:xxx.xxx.xxx.xxx6
1 60 SNAT tcp -- any eth1 anywhere anywhere statistic mode nth every 2 tcp dpt:https to:xxx.xxx.xxx.xx12
1 60 SNAT tcp -- any eth1 anywhere anywhere statistic mode nth every 3 tcp dpt:https to:xxx.xxx.xxx.xx13
1 60 SNAT tcp -- any eth1 anywhere anywhere statistic mode nth every 4 tcp dpt:https to:xxx.xxx.xxx.xx14
0 0 SNAT tcp -- any eth1 anywhere anywhere statistic mode nth every 5 tcp dpt:https to:xxx.xxx.xxx.xx15
0 0 SNAT tcp -- any eth1 anywhere anywhere statistic mode nth every 6 tcp dpt:https to:xxx.xxx.xxx.xx16
1 60 SNAT tcp -- any eth2 anywhere anywhere statistic mode nth every 2 tcp dpt:https to:xxx.xxx.xxx.xx22
1 60 SNAT tcp -- any eth2 anywhere anywhere statistic mode nth every 3 tcp dpt:https to:xxx.xxx.xxx.xx23
1 60 SNAT tcp -- any eth2 anywhere anywhere statistic mode nth every 4 tcp dpt:https to:xxx.xxx.xxx.xx24
0 0 SNAT tcp -- any eth2 anywhere anywhere statistic mode nth every 5 tcp dpt:https to:xxx.xxx.xxx.xx25
0 0 SNAT tcp -- any eth2 anywhere anywhere statistic mode nth every 6 tcp dpt:https to:xxx.xxx.xxx.xx26
Output of ip rule:
0: from all lookup local
32501: from all fwmark 0x2 lookup if2
32501: from all fwmark 0x1 lookup if1
32602: from xxx.xxx.xxx.x12 lookup 10001
32603: from xxx.xxx.xxx.x13 lookup 10001
32604: from xxx.xxx.xxx.x14 lookup 10001
32605: from xxx.xxx.xxx.x15 lookup 10001
32606: from xxx.xxx.xxx.x16 lookup 10001
32612: from xxx.xxx.xxx.x22 lookup 10002
32613: from xxx.xxx.xxx.x23 lookup 10002
32614: from xxx.xxx.xxx.x24 lookup 10002
32615: from xxx.xxx.xxx.x25 lookup 10002
32616: from xxx.xxx.xxx.x26 lookup 10002
All the lookup 1000x rules are added automatically by Amazon when attaching the ENI. The fwmark rules I added myself.
The if1 and if2 tables look something like this:
default via yyy.yyy.yyy.yyy dev eth2
yyy.yyy.yyy.0/20 dev eth2 scope link
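For context, rules and tables of this shape could be created with commands along these lines. This is a sketch under assumptions: the helper name `gen_policy_routing`, the gateway and the subnet are hypothetical placeholders, and the table names if1 and if2 are assumed to be declared in /etc/iproute2/rt_tables. The commands are printed rather than executed.

```shell
#!/bin/sh
# Print (do not run) the policy-routing commands for one interface.
# Hypothetical helper for illustration.
# Usage: gen_policy_routing MARK TABLE DEV GATEWAY SUBNET
gen_policy_routing() {
    mark=$1 table=$2 dev=$3 gw=$4 subnet=$5
    # On-link route for the interface's subnet, in its own table.
    echo "ip route add $subnet dev $dev scope link table $table"
    # Default route out of that interface, in the same table.
    echo "ip route add default via $gw dev $dev table $table"
    # Send packets carrying the firewall mark to that table.
    echo "ip rule add fwmark $mark lookup $table"
}

# Placeholder addresses; substitute the real gateway and subnet.
gen_policy_routing 0x1 if1 eth1 10.0.16.1 10.0.16.0/20
gen_policy_routing 0x2 if2 eth2 10.0.32.1 10.0.32.0/20
```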
The test I am running is curl https://ifconfig.me, which works for about a third of requests. I am guessing this is because the replies to those requests arrive on the default eth0 interface, which knows how to deal with them properly.
The other two thirds of requests just hang indefinitely. It's worth noting that the outgoing packets hit these tables and seem to do the right thing (namely, the SNAT rules are changing the source IP addresses of outgoing packets). This has stumped me for some time.