
I've got an issue which caused me quite a headache. I found the solution, but I cannot understand why the problem was happening in the first place.

I've got an nginx reverse proxy which routes requests to one of three servers. All are Node.js servers; two of them are Express and one is custom-built.

The issue: when my firewall is up, the first request in the total uptime of the two Express servers suffers a 504 Gateway Timeout. After that happens, files are served quickly and without issue, even over the course of multiple days.

I was able to narrow the problem down to the firewall by noticing that the issue disappears when I bring the firewall down completely. My server runs FreeBSD, with PF as the firewall. Here is the relevant pf.conf:

iface = "vtnet0"
loopback = "lo0"
public_ip = "132.148.77.28/32"
localnet = "127.0.0.1"

nat on $iface from any to any -> $public_ip # added by cloud server provider
pass out quick on $iface proto { tcp udp } from port ntp keep state
block log all
pass in on $iface proto tcp from $localnet to port mail
pass in log on $iface proto tcp to $public_ip port { ssh http } keep state
pass in log on $loopback proto tcp to $localnet port { 5555 5556 5557 } keep state
pass out log all keep state

I understand that any traffic on the vtnet0 device will be translated to use the public IP.

I ran a tcpdump with the following command: sudo tcpdump -c 10 -vvv -t -i pflog0 -e -n tcp

This captures 10 packets with triple verbosity, suppresses timestamps, listens on pflog0, prints the pf rule which passed or blocked each packet, leaves IP addresses and ports as numbers, and captures only TCP traffic.
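(An aside for anyone reading the dumps below: the "rule N" numbers that pflog reports refer to the loaded ruleset, not to pf.conf line numbers. As far as I know you can match them up by listing the loaded rules together with their numbers:)

$ sudo pfctl -vv -s rules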

Here are the interesting results of the first request, along with the results of subsequent requests:

$ sudo tcpdump -c 10 -vvv -t -i pflog0 -e -n tcp
tcpdump: WARNING: pflog0: no IPv4 address assigned
tcpdump: listening on pflog0, link-type PFLOG (OpenBSD pflog file), capture size 65535 bytes

rule 8..16777216/0(match): pass in on vtnet0: (tos 0x0, ttl 52, id 53488, offset 0, flags [DF], proto TCP (6), length 60)
    67.197.156.119.45274 > 132.148.77.28.80: Flags [S], cksum 0x76df (correct), seq 356096480, win 29200, options [mss 1460,sackOK,TS val 63916304 ecr 0,nop,wscale 7], length 0

rule 9..16777216/0(match): pass out on lo0: (tos 0x0, ttl 64, id 2406, offset 0, flags [DF], proto TCP (6), length 60)
    127.0.0.1.28850 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0x180a), seq 2069090201, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33367168 ecr 0], length 0

rule 5..16777216/0(match): pass in on lo0: (tos 0x0, ttl 64, id 2406, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->3354)!)
    127.0.0.1.28850 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0x180a), seq 2069090201, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33367168 ecr 0], length 0

rule 9..16777216/0(match): pass out on lo0: (tos 0x0, ttl 64, id 2414, offset 0, flags [DF], proto TCP (6), length 60)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0x1728), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33367544 ecr 0], length 0

rule 2..16777216/0(match): **block in on lo0**: (tos 0x0, ttl 64, id 2414, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->8ded)!)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0x1728), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33367544 ecr 0], length 0

rule 2..16777216/0(match): **block in on lo0**: (tos 0x0, ttl 64, id 2416, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->8deb)!)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0x0b6a), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33370550 ecr 0], length 0

rule 2..16777216/0(match): **block in on lo0**: (tos 0x0, ttl 64, id 2418, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->8de9)!)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0xfee6), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33373753 ecr 0], length 0

rule 2..16777216/0(match): **block in on lo0**: (tos 0x0, ttl 64, id 2420, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->8de7)!)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0xf265), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33376954 ecr 0], length 0

rule 2..16777216/0(match): **block in on lo0**: (tos 0x0, ttl 64, id 2422, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->8de5)!)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0xe5c8), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33380183 ecr 0], length 0

rule 2..16777216/0(match): **block in on lo0**: (tos 0x0, ttl 64, id 2424, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->8de3)!)
    132.148.77.28.48488 > 132.148.77.28.80: Flags [S], cksum 0xa38f (incorrect -> 0xd943), seq 3973611725, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 33383388 ecr 0], length 0
10 packets captured
11 packets received by filter
0 packets dropped by kernel

$ sudo tcpdump -c 10 -vvv -t -i pflog0 -e -n tcp
tcpdump: WARNING: pflog0: no IPv4 address assigned
tcpdump: listening on pflog0, link-type PFLOG (OpenBSD pflog file), capture size 65535 bytes

rule 8..16777216/0(match): pass in on vtnet0: (tos 0x0, ttl 52, id 408, offset 0, flags [DF], proto TCP (6), length 60)
    67.197.156.119.45482 > 132.148.77.28.80: Flags [S], cksum 0x2e49 (correct), seq 3314040182, win 29200, options [mss 1460,sackOK,TS val 64109806 ecr 0,nop,wscale 7], length 0

rule 9..16777216/0(match): pass out on lo0: (tos 0x0, ttl 64, id 2446, offset 0, flags [DF], proto TCP (6), length 60)
    127.0.0.1.19450 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0x603e), seq 629617255, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 34141175 ecr 0], length 0

rule 5..16777216/0(match): pass in on lo0: (tos 0x0, ttl 64, id 2446, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->332c)!)
    127.0.0.1.19450 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0x603e), seq 629617255, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 34141175 ecr 0], length 0

rule 9..16777216/0(match): pass out on lo0: (tos 0x0, ttl 64, id 2459, offset 0, flags [DF], proto TCP (6), length 60)
    127.0.0.1.54320 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0x8ec2), seq 2137792783, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 34141359 ecr 0], length 0

rule 5..16777216/0(match): pass in on lo0: (tos 0x0, ttl 64, id 2459, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->331f)!)
    127.0.0.1.54320 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0x8ec2), seq 2137792783, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 34141359 ecr 0], length 0

rule 8..16777216/0(match): pass in on vtnet0: (tos 0x0, ttl 52, id 13349, offset 0, flags [DF], proto TCP (6), length 60)
    67.197.156.119.45484 > 132.148.77.28.80: Flags [S], cksum 0x9e57 (correct), seq 2310473985, win 29200, options [mss 1460,sackOK,TS val 64109860 ecr 0,nop,wscale 7], length 0

rule 8..16777216/0(match): pass in on vtnet0: (tos 0x0, ttl 52, id 32739, offset 0, flags [DF], proto TCP (6), length 60)
    67.197.156.119.45486 > 132.148.77.28.80: Flags [S], cksum 0x5b67 (correct), seq 2464105159, win 29200, options [mss 1460,sackOK,TS val 64109860 ecr 0,nop,wscale 7], length 0

rule 8..16777216/0(match): pass in on vtnet0: (tos 0x0, ttl 52, id 55497, offset 0, flags [DF], proto TCP (6), length 60)
    67.197.156.119.45488 > 132.148.77.28.80: Flags [S], cksum 0x832f (correct), seq 3354322413, win 29200, options [mss 1460,sackOK,TS val 64109860 ecr 0,nop,wscale 7], length 0

rule 9..16777216/0(match): pass out on lo0: (tos 0x0, ttl 64, id 2474, offset 0, flags [DF], proto TCP (6), length 60)
    127.0.0.1.60189 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0xe2c0), seq 1589892785, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 34141386 ecr 0], length 0

rule 5..16777216/0(match): pass in on lo0: (tos 0x0, ttl 64, id 2474, offset 0, flags [DF], proto TCP (6), length 60, bad cksum 0 (->3310)!)
    127.0.0.1.60189 > 127.0.0.1.5556: Flags [S], cksum 0xfe30 (incorrect -> 0xe2c0), seq 1589892785, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 34141386 ecr 0], length 0
10 packets captured
14 packets received by filter
0 packets dropped by kernel

I added newlines between the packet info for better readability. You'll notice in the first dump that "rule 2" is blocking packets, which happens only on the very first request after the upstream server is started afresh.

What gives? You'll notice too that those blocked packets are on the loopback interface, yet they are coming from and going to the public IP address. I think they shouldn't be NAT'ed because they're not on the vtnet0 interface. Even so, why do subsequent requests take a different path to facilitate communication from nginx to the upstream server?

The fix I found was to change the following line of my pf.conf:

pass in log on $iface proto tcp to $public_ip port { ssh http } keep state

to:

pass in log proto tcp to $public_ip port { ssh http } keep state

I removed the on $iface part so the firewall will pass even packets to $public_ip port 80 on the loopback interface. I also have mild concerns about the security implications of loosening that rule, which are mild only because I no longer fully understand how the traffic moves between these different network interfaces.

At this point I'm not even sure if the issue is with my firewall or with nginx, just that when the firewall is disabled, the issue vanishes.

Note that nginx is proxy_pass'ing the requests in three different server blocks to http://127.0.0.1:[port] where port is one of 5555, 5556, or 5557.
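For a concrete picture, each of those server blocks boils down to something like this (the server_name is a placeholder, and the real blocks may carry more directives):

server {
    listen 80;
    server_name example.com;                 # placeholder; the real hostnames differ
    location / {
        proxy_pass http://127.0.0.1:5556;    # 5555, 5556, or 5557 depending on the block
    }
}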

    If you do a `netstat -rn` you will probably see that the public IP is routed to `lo0`. I do not use PF, but it is usual with IPFW to pass any traffic from `lo0` to `lo0` – Richard Smith Feb 11 '17 at 16:40
  • You're right: `132.148.77.28 link#1 UHS lo0 132.148.77.28/32 link#1 U vtnet0` but why are there two entries - one to lo0 and one to vtnet0? My main question remains though, why is only the first request suffering from having a different ip opened up on lo0 instead of the 127.0.0.1 which is specified in the `proxy_pass` directive of the nginx config? – papiro Feb 11 '17 at 16:51
  • sorry about the cramming of those two lines from `netstat -rn`. I couldn't figure out how to create newlines here. – papiro Feb 11 '17 at 16:56

1 Answer


After much research, the correct fix for my issue, instead of modifying the line I mentioned, was to remove that line completely and to add a set skip on lo0 in the Options block of the pf.conf (after the Macros and before the Packet Filtering rules). This totally bars the loopback interface from being touched by pf.
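For anyone wondering where exactly that goes, here is a sketch of the top of the resulting pf.conf; the translation and filtering rules that follow are elided because, apart from the change described above, they keep the same ordering as in the question:

iface = "vtnet0"
loopback = "lo0"
public_ip = "132.148.77.28/32"
localnet = "127.0.0.1"

# Options -- after the macros, before translation and filtering
set skip on lo0    # pf no longer inspects traffic on the loopback interface

# NAT and filter rules continue below
# ...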

As for non-127.x.x.x IPs appearing on the loopback device, I have a theory. As @Richard Smith suggested, I ran netstat -rn and noticed the following two lines:

Destination      Gateway Flags Netif
132.148.77.28    link#1  UHS   lo0 
132.148.77.28/32 link#1  U     vtnet0

I did some digging and discovered that these two entries were added as an alias, via ifconfig, to the first ethernet interface listed in ifconfig's output: vtnet0. So it naturally seems that traffic really could be routed through the loopback to 132.148.77.28, as that route demonstrates; it would just use vtnet0 as the gateway to do it. Now, I'm not sure whether, on reaching vtnet0 as the gateway, the packet got NAT'ed by the pf rule, but then I'm not sure why tcpdump would still show the old interface (lo0) it came from.

It still seems odd to me that the upstream server, which is listening on 127.0.0.1, would choose to open a port on 132.148.77.28 to try to respond back to the proxy server the first time, and then on subsequent requests just use 127.0.0.1 and have the reverse proxy "come down to its level" for the communication. My theory is that the routing algorithm "remembered" that the previous path was bad (i.e. getting blocked), and so tried a different approach.

I won't mark my answer as the accepted answer because I really need a professional with a highly trained eye to verify whether my lightweight theories have any validity.
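If someone with that trained eye wants to double-check the routing side of this on a similar box, the lookup the kernel makes for the public address can be inspected directly with standard FreeBSD commands (output omitted since it is machine-specific):

$ route -n get 132.148.77.28      # shows the chosen route entry and its interface
$ netstat -rn | grep 132.148.77   # the two route entries quoted above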
