I have a very interesting problem:

I have a Proxmox hypervisor with two Linux VMs on it:

  • The first VM has several NICs in the main bridge; each NIC is attached to the VM with a specific VLAN tag set on the hypervisor.
  • The second VM has only one NIC in the main bridge, but it has VLAN interfaces configured inside the VM.

The network settings are identical, but on the second VM the TCP handshake does not work. ICMP and UDP, on the other hand, work fine.

The problem occurs in every traffic direction involving the second VM:

  • from the second VM to the outside world.
  • from the second VM to the router (and the reverse).
  • from the second VM to the first VM (and the reverse).

How did I test it?

  • ping: works fine
  • nslookup (udp mode): works fine
  • nslookup (tcp mode): timeout error
  • telnet or ssh: timeout error
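
For reference, the checks above can be reproduced roughly as follows. This is only a sketch: the `run` and `connectivity_checks` helpers, the `RUN` variable, the `example.com` lookup name, and the choice of the router as DNS server are assumptions made for illustration, not details from the original tests. By default the script only prints the commands.

```shell
#!/bin/sh
# Print each check by default; set RUN=1 to actually execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Usage: connectivity_checks TARGET DNS_SERVER
connectivity_checks() {
    target=$1
    dns=$2
    run ping -c 3 "$target"                  # ICMP
    run nslookup example.com "$dns"          # DNS over UDP (the default)
    run nslookup -vc example.com "$dns"      # DNS over TCP ("virtual circuit")
    run telnet "$target" 80                  # plain TCP handshake
}

# Hypothetical invocation: second VM as target, router as DNS server.
connectivity_checks 192.168.32.80 192.168.32.1
```

With the symptoms described here, the first two checks would be expected to succeed and the last two to time out.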

Then I decided to capture the traffic and analyze it in Wireshark:

I see the same problem everywhere: SYN and SYN/ACK packets from the second VM are not recognized by the receiver. It looks as if these packets never arrive, yet the captures show that they do (see below).

Here are four captures, taken while I tried to connect to port 80 on each VM via telnet.

  • Router: 192.168.32.1
  • First vm: 192.168.32.70
  • Second vm: 192.168.32.80

Successful connection to the first VM (capture from the client):

No.     Time        Source                Destination           Protocol Length Info
      2 1.311927    192.168.32.1          192.168.32.70         TCP      74     38873→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54904054 TSecr=0 WS=32
      3 1.347181    192.168.32.70         192.168.32.1          TCP      74     80→38873 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=57170781 TSecr=54904054 WS=128
      4 1.347223    192.168.32.1          192.168.32.70         TCP      66     38873→80 [ACK] Seq=1 Ack=1 Win=13856 Len=0 TSval=54904058 TSecr=57170781

Successful connection to the first VM (capture from the server):

No.     Time        Source                Destination           Protocol Length Info
      1 0.000000    192.168.32.1          192.168.32.70         TCP      74     38873→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54904054 TSecr=0 WS=32
      2 0.000128    192.168.32.70         192.168.32.1          TCP      74     80→38873 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=57170781 TSecr=54904054 WS=128
      3 0.051272    192.168.32.1          192.168.32.70         TCP      66     38873→80 [ACK] Seq=1 Ack=1 Win=13856 Len=0 TSval=54904058 TSecr=57170781

Unsuccessful connection to the second VM (capture from the client):

No.     Time        Source                Destination           Protocol Length Info
     25 0.889659    192.168.32.1          192.168.32.80         TCP      74     37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864760 TSecr=0 WS=32
     27 0.925075    192.168.32.80         192.168.32.1          TCP      74     80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=210548 TSecr=54864760 WS=128
     34 1.880028    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864860 TSecr=0 WS=32
     35 1.915204    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294967049 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=210795 TSecr=54864760 WS=128
     51 2.912418    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294966799 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=211045 TSecr=54864760 WS=128
     63 3.880067    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865060 TSecr=0 WS=32
     64 3.917480    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294966549 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=211295 TSecr=54864760 WS=128
     67 5.912529    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294966049 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=211795 TSecr=54864760 WS=128
     73 7.890030    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865461 TSecr=0 WS=32
     74 7.925401    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=4294965546 Ack=1 Win=28960 Len=0 MSS=1384 SACK_PERM=1 TSval=212298 TSecr=54864760 WS=128

Unsuccessful connection to the second VM (capture from the server):

No.     Time        Source                Destination           Protocol Length Info
      1 0.000000    192.168.32.1          192.168.32.80         TCP      74     37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864760 TSecr=0 WS=32
      2 0.000105    192.168.32.80         192.168.32.1          TCP      74     80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=210548 TSecr=54864760 WS=128
      3 0.990176    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54864860 TSecr=0 WS=32
      4 0.990240    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=210795 TSecr=54864760 WS=128
      5 1.987305    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=211045 TSecr=54864760 WS=128
      6 2.991251    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865060 TSecr=0 WS=32
      7 2.991317    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=211295 TSecr=54864760 WS=128
      8 4.987338    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=211795 TSecr=54864760 WS=128
     11 7.000116    192.168.32.1          192.168.32.80         TCP      74     [TCP Spurious Retransmission] 37740→80 [SYN] Seq=0 Win=13840 Len=0 MSS=1384 SACK_PERM=1 TSval=54865461 TSecr=0 WS=32
     12 7.000184    192.168.32.80         192.168.32.1          TCP      74     [TCP Retransmission] 80→37740 [SYN, ACK] Seq=0 Ack=1 Win=28960 Len=0 MSS=1460 SACK_PERM=1 TSval=212298 TSecr=54864760 WS=128

I don't understand why it doesn't work. Any ideas?

At MarkoPolo's request, I am adding the output of these commands from the second VM:

ifconfig

ens18     Link encap:Ethernet  HWaddr 12:7c:7f:a1:8a:b4  
          inet addr:10.10.100.80  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::107c:7fff:fea1:8ab4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:185759 errors:0 dropped:27 overruns:0 frame:0
          TX packets:1186 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:23593488 (23.5 MB)  TX bytes:173539 (173.5 KB)

ens18.32  Link encap:Ethernet  HWaddr 12:7c:7f:a1:8a:b4  
          inet addr:192.168.32.80  Bcast:0.0.0.0  Mask:255.255.255.0
          inet6 addr: fe80::107c:7fff:fea1:8ab4/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1821 errors:0 dropped:0 overruns:0 frame:0
          TX packets:52 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:195803 (195.8 KB)  TX bytes:3718 (3.7 KB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:160 errors:0 dropped:0 overruns:0 frame:0
          TX packets:160 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1 
          RX bytes:11840 (11.8 KB)  TX bytes:11840 (11.8 KB)

route -n

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.10.100.1     0.0.0.0         UG    0      0        0 ens18
10.10.100.0     0.0.0.0         255.255.255.0   U     0      0        0 ens18
192.168.32.0    0.0.0.0         255.255.255.0   U     0      0        0 ens18.32
kvaps
  • So the packet capture from the client side shows the SYN/ACK from the server actually being received so the final ACK should be sent to complete the TCP handshake. This isn't happening so I'm assuming that the Kernel is dropping the packet. Perhaps a firewall rule is dropping the return traffic or maybe [Reverse Path Filtering](http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.kernel.rpf.html) is dropping the packet because it is arriving on the "wrong" network interface. Can you add the output of `ifconfig` and `route -n` to your original post? – Mark Riddell Dec 14 '16 at 16:07
  • Hi, I added the output of the commands from the server. I don't see any reason to add the client configuration, since it works normally with the first server. – kvaps Dec 14 '16 at 17:35
  • I should add that the firewall is disabled. I also captured on all interfaces, with the same result. The client doesn't answer the `SYN / ACK`, and the same happens in the opposite direction: when I initiate a connection from the VM to an external server, the server just ignores the `SYN` packet from the VM. – kvaps Dec 14 '16 at 17:45

1 Answer

I resolved this issue by adding ens18 to a bridge, br0, and creating the VLAN interface on top of the bridge as br0.32.

It seems to be an Ubuntu kernel bug: I tested the same setup with an Arch Linux ISO and it works properly. I will file a bug report on Launchpad...
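
The workaround can be sketched with iproute2 roughly as follows. This is an illustration under assumptions, not the exact commands that were used: the `run` and `vlan_on_bridge` helpers and the `APPLY` variable are hypothetical, and the sketch assumes the old ens18.32 interface still exists and should be removed first. It prints the commands by default; set `APPLY=1` and run as root to execute them.

```shell
#!/bin/sh
# Print each command by default; set APPLY=1 (as root) to execute it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Usage: vlan_on_bridge NIC BRIDGE VLAN_ID CIDR
vlan_on_bridge() {
    nic=$1
    bridge=$2
    vid=$3
    cidr=$4
    run ip link delete "$nic.$vid"                      # drop the VLAN on the NIC
    run ip link add name "$bridge" type bridge          # create the bridge
    run ip link set dev "$nic" master "$bridge"         # enslave the NIC to it
    run ip link add link "$bridge" name "$bridge.$vid" type vlan id "$vid"
    run ip addr add "$cidr" dev "$bridge.$vid"          # move the VLAN address
    run ip link set dev "$bridge" up
    run ip link set dev "$bridge.$vid" up
}

# Values taken from the ifconfig output in the question.
vlan_on_bridge ens18 br0 32 192.168.32.80/24
```

The sketch does not cover the untagged address on ens18 (10.10.100.80) or the default route, which would presumably have to move to br0 as well, and it does not make the change persistent across reboots.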

kvaps