I have 2 servers connected through a Cisco Linksys SG500 switch. Both hosts have identical configurations (Ubuntu 14.04 LTS, 4 × 1 Gb/s network interfaces with bonding).

When I set up a bridge and interface bonding on both servers and ping the first host from the second, or the second from the first, everything works fine.

But when I set up KVM with the bridge and interface bonding and ping a virtual host from either physical server, I get 80% packet loss. Why?

$ ping 10.0.101.11
PING 10.0.101.11 (10.0.101.11) 56(84) bytes of data.
64 bytes from 10.0.101.11: icmp_seq=1 ttl=64 time=0.393 ms
64 bytes from 10.0.101.11: icmp_seq=7 ttl=64 time=0.219 ms
64 bytes from 10.0.101.11: icmp_seq=8 ttl=64 time=0.235 ms
64 bytes from 10.0.101.11: icmp_seq=9 ttl=64 time=0.228 ms
64 bytes from 10.0.101.11: icmp_seq=10 ttl=64 time=0.260 ms
64 bytes from 10.0.101.11: icmp_seq=11 ttl=64 time=0.285 ms
64 bytes from 10.0.101.11: icmp_seq=12 ttl=64 time=0.194 ms
64 bytes from 10.0.101.11: icmp_seq=13 ttl=64 time=0.212 ms
64 bytes from 10.0.101.11: icmp_seq=14 ttl=64 time=0.279 ms
64 bytes from 10.0.101.11: icmp_seq=15 ttl=64 time=0.227 ms
64 bytes from 10.0.101.11: icmp_seq=57 ttl=64 time=0.324 ms
^C
--- 10.0.101.11 ping statistics ---
57 packets transmitted, 11 received, 80% packet loss, time 55999ms
rtt min/avg/max/mdev = 0.194/0.259/0.393/0.058 ms

Logs, configs and other stuff here: http://pastebin.com/LXbmS2gp

Please help.

Andrew Schulman
supaplex

1 Answer

bond-mode 0

This is the problem. Bridging doesn't work with bonding modes 0 (balance-rr) and 6 (balance-alb). For optimal performance, you should stick to either mode 1 (active-backup) or mode 4 (802.3ad).
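
As a rough illustration, a bridge on top of a mode 4 bond in `/etc/network/interfaces` on Ubuntu 14.04 might look something like the sketch below. The interface names (`eth0`, `eth1`, `bond0`, `br0`), the address, and the netmask are assumptions for the example and are not taken from the posted configs. Mode 4 also assumes a matching LACP link aggregation group is configured on the switch ports; if that is not possible, `bond-mode active-backup` (mode 1) needs no switch-side setup.

```
# Sketch only: interface names and addresses are hypothetical.
# Repeat the slave stanza for each physical NIC in the bond.
auto eth0
iface eth0 inet manual
    bond-master bond0

auto eth1
iface eth1 inet manual
    bond-master bond0

# The bond itself carries no IP address; the bridge does.
auto bond0
iface bond0 inet manual
    bond-mode 802.3ad      # mode 4; or "active-backup" for mode 1
    bond-miimon 100        # link monitoring interval in ms
    bond-lacp-rate 1
    bond-slaves none

# KVM guests attach to br0.
auto br0
iface br0 inet static
    address 10.0.101.10    # hypothetical host address
    netmask 255.255.255.0  # hypothetical netmask
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0
```

After bringing the interfaces back up, the active mode can be checked in `/proc/net/bonding/bond0` (the "Bonding Mode" line).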

dyasny
  • Or mode 5, which will rewrite source MAC to a host slave MAC when load balancing out a non-primary slave. – suprjami Nov 13 '14 at 11:04
  • Or mode 2 or 3; they all work to some degree. The ones that work best are 1 (because it's so simple) and 4 (because it relies on the hardware instead of trying to manipulate traffic at the host level). Only 0 and 6 are heavily problematic, while 1 and 4 receive the most testing. – dyasny Nov 13 '14 at 14:51
  • I can't see any reason why any of 0 to 4 shouldn't work. Round robin (0) and broadcast (3) probably aren't ideal for TCP, but they should still work with the right EtherChannel config on the switch, right? – suprjami Nov 14 '14 at 05:48
  • AFAIK there was some problem with ARP monitoring or some such. https://access.redhat.com/solutions/661043 – dyasny Nov 14 '14 at 14:52