
I have a KVM system on which I'm running a network bridge directly between all VMs and a bond0 (eth0, eth1) on the host OS. As such, all machines are presented on the same subnet and are reachable from outside the box. The bond runs in mode 1 (active/passive), with an arp_ip_target set to the default gateway, which has caused some issues in itself, but I can't see the bond config mattering here.

I'm seeing odd behaviour most times I stop or start a guest on the platform: the host loses network connectivity (ICMP, SSH) for about 30 seconds. The other, already running VMs don't lose connectivity; they can always ping the default gateway, but the host can't. I say "about 30 seconds", but from some tests it usually seems to be 28 seconds (or at least, I lose 28 pings), and I'm wondering if this somehow relates to the bridge config.

I'm not running STP on the bridge at all; the forward delay is set to 1 second, the path cost on bond0 is lowered to 10, and the port priority of bond0 is lowered to 1. As such, I don't think the bridge should ever conclude that bond0 is not connected (continued guest connectivity implies it is fine), yet the IP of the host, which is on the bridge device (could that matter?), becomes unreachable.

I'm fairly sure it's about the bridged networking, but at the same time as this happens when a VM is started there are clearly loads of other things also happening so maybe I'm way off the mark.

Lack of connectivity:

# ping 10.20.11.254                                          
PING 10.20.11.254 (10.20.11.254) 56(84) bytes of data.                          
64 bytes from 10.20.11.254: icmp_seq=1 ttl=255 time=0.921 ms                    
64 bytes from 10.20.11.254: icmp_seq=2 ttl=255 time=0.541 ms                    
type=1700 audit(1293462808.589:325): dev=vnet6 prom=256 old_prom=0 auid=4294967295 ses=4294967295
type=1700 audit(1293462808.604:326): dev=vnet7 prom=256 old_prom=0 auid=4294967295 ses=4294967295
type=1700 audit(1293462808.618:327): dev=vnet8 prom=256 old_prom=0 auid=4294967295 ses=4294967295
kvm: 14116: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x130079               
kvm: 14116: cpu0 unimplemented perfctr wrmsr: 0xc1 data 0xffdd694a              
kvm: 14116: cpu0 unimplemented perfctr wrmsr: 0x186 data 0x530079               
64 bytes from 10.20.11.254: icmp_seq=30 ttl=255 time=0.514 ms                   
64 bytes from 10.20.11.254: icmp_seq=31 ttl=255 time=0.551 ms                   
64 bytes from 10.20.11.254: icmp_seq=32 ttl=255 time=0.437 ms                   
64 bytes from 10.20.11.254: icmp_seq=33 ttl=255 time=0.392 ms 

brctl output of relevant bridge:

# brctl showstp brdev
brdev
 bridge id      8000.b2e1378d1396
 designated root    8000.b2e1378d1396
 root port         0            path cost          0
 max age          19.99         bridge max age        19.99
 hello time        1.99         bridge hello time      1.99
 forward delay         0.99         bridge forward delay       0.99
 ageing time         299.95
 hello timer           0.50         tcn timer          0.00
 topology change timer     0.00         gc timer           0.04
 flags          


vnet5 (3)
 port id        8003            state            forwarding
 designated root    8000.b2e1378d1396   path cost        100
 designated bridge  8000.b2e1378d1396   message age timer      0.00
 designated port    8003            forward delay timer    0.00
 designated cost       0            hold timer         0.00
 flags          

vnet0 (2)
 port id        8002            state            forwarding
 designated root    8000.b2e1378d1396   path cost        100
 designated bridge  8000.b2e1378d1396   message age timer      0.00
 designated port    8002            forward delay timer    0.00
 designated cost       0            hold timer         0.00
 flags          

bond0 (1)
 port id        0001            state            forwarding
 designated root    8000.b2e1378d1396   path cost         10
 designated bridge  8000.b2e1378d1396   message age timer      0.00
 designated port    0001            forward delay timer    0.00
 designated cost       0            hold timer         0.00
 flags          

I do see the new port listed as learning, but in line with the forward delay, only for 1 or 2 seconds when polling the brctl output on a loop.

ifconfig without sample VM:

bond0     Link encap:Ethernet  HWaddr D4:85:64:65:FA:4E  
          inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:21168629 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9280285 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:8777768179 (8.1 GiB)  TX bytes:2671736365 (2.4 GiB)

bradSP1   Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:36 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1656 (1.6 KiB)  TX bytes:6592 (6.4 KiB)

brawSP1   Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:109 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4996 (4.8 KiB)  TX bytes:6592 (6.4 KiB)

brdev     Link encap:Ethernet  HWaddr B2:E1:37:8D:13:96  
          inet addr:10.20.11.129  Bcast:10.20.11.255  Mask:255.255.255.0
          inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16663718 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8800468 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3268513274 (3.0 GiB)  TX bytes:2587834869 (2.4 GiB)

brmgtSP1  Link encap:Ethernet  HWaddr 1A:CA:AE:08:1C:42  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:699322 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:928721301 (885.6 MiB)  TX bytes:6706 (6.5 KiB)

eth0      Link encap:Ethernet  HWaddr D4:85:64:65:FA:4E  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:20412120 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9280285 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:8720799421 (8.1 GiB)  TX bytes:2671736365 (2.4 GiB)
          Interrupt:169 Memory:f4000000-f4012800 

eth1      Link encap:Ethernet  HWaddr D4:85:64:65:FA:4E  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:756509 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:56968758 (54.3 MiB)  TX bytes:0 (0.0 b)
          Interrupt:186 Memory:f2000000-f2012800 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3937 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3937 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:6641553 (6.3 MiB)  TX bytes:6641553 (6.3 MiB)

vnet0     Link encap:Ethernet  HWaddr B2:E1:37:8D:13:96  
          inet6 addr: fe80::b0e1:37ff:fe8d:1396/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:59861 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5924530 errors:0 dropped:0 overruns:2 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:6405635 (6.1 MiB)  TX bytes:1987480170 (1.8 GiB)

vnet1     Link encap:Ethernet  HWaddr 1A:CA:AE:08:1C:42  
          inet6 addr: fe80::18ca:aeff:fe08:1c42/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:541798 errors:0 dropped:0 overruns:0 frame:0
          TX packets:61998 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:802746110 (765.5 MiB)  TX bytes:6498514 (6.1 MiB)

ifconfig with sample VM:

bond0     Link encap:Ethernet  HWaddr D4:85:64:65:FA:4E  
          inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:21285120 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9291457 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:8948482155 (8.3 GiB)  TX bytes:2673235824 (2.4 GiB)

bradSP1   Link encap:Ethernet  HWaddr 2A:18:E1:2D:1A:EC  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:36 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1656 (1.6 KiB)  TX bytes:6592 (6.4 KiB)

brawSP1   Link encap:Ethernet  HWaddr 96:55:AA:14:67:07  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:109 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4996 (4.8 KiB)  TX bytes:6592 (6.4 KiB)

brdev     Link encap:Ethernet  HWaddr 16:5C:BC:E5:90:11  
          inet addr:10.20.11.129  Bcast:10.20.11.255  Mask:255.255.255.0
          inet6 addr: fe80::d685:64ff:fe65:fa4e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16673094 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8801611 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3279365967 (3.0 GiB)  TX bytes:2587927761 (2.4 GiB)

brmgtSP1  Link encap:Ethernet  HWaddr 1A:CA:AE:08:1C:42  
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:699342 errors:0 dropped:0 overruns:0 frame:0
          TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:928723605 (885.6 MiB)  TX bytes:6706 (6.5 KiB)

eth0      Link encap:Ethernet  HWaddr D4:85:64:65:FA:4E  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:20528382 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9291457 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:8891497316 (8.2 GiB)  TX bytes:2673235824 (2.4 GiB)
          Interrupt:169 Memory:f4000000-f4012800 

eth1      Link encap:Ethernet  HWaddr D4:85:64:65:FA:4E  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:756738 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:56984839 (54.3 MiB)  TX bytes:0 (0.0 b)
          Interrupt:186 Memory:f2000000-f2012800 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:3937 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3937 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:6641553 (6.3 MiB)  TX bytes:6641553 (6.3 MiB)

vnet0     Link encap:Ethernet  HWaddr B2:E1:37:8D:13:96  
          inet6 addr: fe80::b0e1:37ff:fe8d:1396/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:69818 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6034715 errors:0 dropped:0 overruns:2 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:7763947 (7.4 MiB)  TX bytes:2149238089 (2.0 GiB)

vnet1     Link encap:Ethernet  HWaddr 1A:CA:AE:08:1C:42  
          inet6 addr: fe80::18ca:aeff:fe08:1c42/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:650557 errors:0 dropped:0 overruns:0 frame:0
          TX packets:72519 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:964153780 (919.4 MiB)  TX bytes:7896728 (7.5 MiB)

vnet2     Link encap:Ethernet  HWaddr AA:4B:22:76:D2:EC  
          inet6 addr: fe80::a84b:22ff:fe76:d2ec/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:10521 errors:0 dropped:0 overruns:0 frame:0
          TX packets:108765 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:1398214 (1.3 MiB)  TX bytes:161408138 (153.9 MiB)

vnet3     Link encap:Ethernet  HWaddr 96:55:AA:14:67:07  
          inet6 addr: fe80::9455:aaff:fe14:6707/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:0 (0.0 b)  TX bytes:468 (468.0 b)

vnet4     Link encap:Ethernet  HWaddr 2A:18:E1:2D:1A:EC  
          inet6 addr: fe80::2818:e1ff:fe2d:1aec/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:0 (0.0 b)  TX bytes:468 (468.0 b)

vnet5     Link encap:Ethernet  HWaddr 16:5C:BC:E5:90:11  
          inet6 addr: fe80::145c:bcff:fee5:9011/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:241 errors:0 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:500 
          RX bytes:0 (0.0 b)  TX bytes:47167 (46.0 KiB)

All pointers, tips or stabs in the dark appreciated.

Chris Phillips
  • Putting an IP on the bridge device shouldn't matter. I'd try running tcpdump on brdev to see if ARPs aren't being generated or aren't receiving replies. Also try looking at `watch -n 1 brctl showmacs brdev` to see what MACs are listed. – Mark Wagner Dec 27 '10 at 21:01

2 Answers

Can you please post ifconfig -a from before and after you start the VM?

dyasny
  • updated with ifconfig's, thanks. I'm not sure what use these are directly, as outside of this time of transition, both setups appear to work fine. – Chris Phillips Dec 27 '10 at 16:53
  • still not getting the full picture here. you have eth0 HWaddr D4:85:64:65:FA:4E and eth1 HWaddr D4:85:64:65:FA:4E, both bonded into bond0 HWaddr D4:85:64:65:FA:4E (note all MACs are the same here? not healthy!) Now, the bridge on top of those should normally have the same MAC, but it doesn't. What I normally do is set up a bridge manually, not using libvirt's built-in stuff, and use it for the VMs. When a VM starts, it gets assigned a MAC from the 54:52:xxx range. If that MAC is lower in value than the current bridge MAC (the one taken from the bond or eth), the bridge will take the lower MAC – dyasny Dec 28 '10 at 09:45
  • and with macs changing on the fly like this, you will have a temporary outage until the switches re-learn the new arp tables. In recent versions of libvirt this is addressed by assigning VMs with MACs from the FE:xx:xx... range, so they are always higher than whatever the bridge has from the eth/bond, and the bridge mac doesn't get replaced. assuming brdev is the bridge you use here, note in the first ifconfig how your MAC is B2:E1:37:8D:13:96 while after starting some VMs it is 16:5C:BC:E5:90:11.
    solution - upgrade to latest libvirt and set up a bridge normally.
    – dyasny Dec 28 '10 at 09:52
  • btw, what's the distro you are using there? – dyasny Dec 28 '10 at 09:53
  • What use are they?? The MAC of the bridge has changed!! Duh! It's not until we ARP again for the gateway address that the gateway gets our new MAC and its echo replies are accepted by us – Chris Phillips Dec 28 '10 at 11:23
  • Hmm, didn't refresh to see those comments. I'm using CentOS, with all updates, and can't use internal libvirt bridges, or don't want to at least, as there *seems* to be no way to make them bridge to the external NIC, just isolated or NATted, and the STP configs for them are horrible via virsh etc. I am in an odd situation here though, as this is a small dev server with very limited external networking. When this is production the bridge would NOT have an IP, as host and guests would be on a different VLAN etc. So actually this will magically disappear with a proper network connected. Thanks. – Chris Phillips Dec 28 '10 at 11:28
  • As far as the bond and the eths sharing MACs, that's the way it works with the bond module, nothing weird being done; it just picks the MAC of one of the ethernet ports. I was thinking that as I'm 1) bridging and 2) doing an active / passive bond, I could actually do STP properly, drop the bond altogether and let STP handle the loop on the eths. Also I was looking at using a tap interface for the host IP off the bridge, but couldn't suss out how to create one (probably as it's so simple it's not worth documenting much...) – Chris Phillips Dec 28 '10 at 11:30
  • http://fpaste.org/LYwW/ Here's a snip of my setup: – dyasny Dec 28 '10 at 12:37
  • The blocker I seem to have is making the mac of the bridge static. I saw some talk about just using ifconfig hw ether to set it to a permanent device, e.g. bond0 in my case, and whilst it took and worked for a while, it is changing dynamically again after a period of time. – Chris Phillips Dec 28 '10 at 13:13
  • ... so therefore I can set a deliberately low mac on the bond0 interface and then that will always be used. Seems like a goer... – Chris Phillips Dec 28 '10 at 13:36
  • As I said, this is fixed in recent versions of libvirt (for about 6 months, iirc), so all you need to do is update: it will rewrite the MAC of the tap device used by the VM before attaching it to the bridge – dyasny Dec 28 '10 at 14:11
  • Right yes I see, all clear. Well when CentOS upgrades we'll pick that up. Until then though, as I have to create a bond interface within cobbler anyway, I can just set a mac there fairly trivially. – Chris Phillips Dec 28 '10 at 14:54
  • glad I could help :) – dyasny Dec 28 '10 at 18:54
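
To illustrate the comment thread above: a Linux bridge that has not been pinned to a fixed address takes the lowest MAC among its ports, which is consistent with brdev changing address in the two ifconfig listings once vnet5 joins. The sketch below only compares address strings and touches no interfaces. `lowest_mac` is a hypothetical helper (not part of brctl or libvirt), and it assumes colon-separated, zero-padded addresses. The port membership (bond0, vnet0, vnet5) is taken from the brctl and ifconfig output in the question.

```shell
#!/bin/sh
# lowest_mac: print the numerically lowest of the MAC addresses passed as
# arguments. With fixed-width, upper-case hex, a byte-wise sort in the C
# locale gives the same order as a numeric comparison.
lowest_mac() {
    printf '%s\n' "$@" | tr 'a-f' 'A-F' | LC_ALL=C sort | head -n 1
}

# Ports on brdev before the new guest starts: bond0 and vnet0.
lowest_mac D4:85:64:65:FA:4E B2:E1:37:8D:13:96
# prints B2:E1:37:8D:13:96

# After the guest starts, vnet5 joins with a lower address.
lowest_mac D4:85:64:65:FA:4E B2:E1:37:8D:13:96 16:5C:BC:E5:90:11
# prints 16:5C:BC:E5:90:11
```

Both results match the HWaddr that ifconfig reports for brdev in the respective listing, so either keeping every guest tap MAC above the bond's (the FE:xx:xx range mentioned above) or giving bond0 a deliberately low MAC should keep the bridge address stable.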

I'm running almost the same setup and having the same issues (also not using STP). When I'm running more VMs the issues are pretty much gone and the host stays available. Can you try pinging your gateway from the host in the background and then try to reproduce the issue by stopping/destroying a VM? This seems to work for me. I think that when the host or a VM on the host sends traffic on the same VLAN or bridge, it triggers something that keeps the host itself available.

Erwin
  • In the ping command in the original post that's what I'm doing. I lose 28 echo requests at the same point as, as per the output, the VM comes online and the vcpus are seen to be created / messed with. I've yet to work out if this actually represents a problem or not; either way it's very undesirable. – Chris Phillips Dec 28 '10 at 09:47
  • As above, the bridge mac is changing, so arp entries are invalid for a period of time. – Chris Phillips Dec 28 '10 at 11:32