
We are running an OpenNebula 5.0.2 environment on Ubuntu 16.04, using Open vSwitch 2.5 to bridge the virtual interfaces and LACP to trunk the two Gbit ports, which is working perfectly.

But when I run an iperf3 bandwidth test between a VM and its host, htop shows 100 % CPU load for the qemu process running that VM and iperf3 reaches only 100-200 Mbps, even though no other bandwidth-hungry VMs are running. iperf3 between two VM hosts gets me almost the full 1 Gbps and no noticeable CPU load.
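Roughly, the test looks like this (the host's IP here is just a placeholder):

# on the host: start the iperf3 server
iperf3 -s
# in the guest: run the client against the host for 30 seconds
iperf3 -c 10.0.0.1 -t 30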

I used to believe it was an Open vSwitch issue back when we were still on 2.0.2, but now I think some virtual networking optimization is missing...

cbix

2 Answers


One huge optimization I could apply successfully (and easily, without exchanging the NIC etc.) was to use the virtio model, either by default for all NICs in the VM template or for each NIC separately, as described here:

NIC_DEFAULT = [
  MODEL = "virtio" ]
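The same model can also be set on a single NIC in the template; a minimal example (the network name is just a placeholder):

NIC = [
  NETWORK = "private",
  MODEL = "virtio" ]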

For an already instantiated VM, shut it down, detach all NICs and reattach them with the "virtio" model.
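Roughly like this with the onevm CLI (VM and NIC IDs are examples, and the exact options may differ slightly between OpenNebula versions):

onevm poweroff 42
onevm nic-detach 42 0                 # repeat for each NIC ID
onevm nic-attach 42 --file nic.tmpl   # nic.tmpl contains a NIC with MODEL = "virtio"
onevm resume 42
# inside the guest, "ethtool -i eth0" should then report virtio_net as the driver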

In my first tests it increased iperf3 bandwidth to 5.6 Gbps between host and guest and decreased host CPU load to roughly 50-60 % per qemu thread during the test (and to below 5 % at almost 1 Gbps when running the iperf3 client from a Gbit-connected host).

If you know about further optimizations, feel free to add them!

cbix

Anything that has to go through a virtual bridge is going to take a pretty big hit. This is true of both OVS and Linux bridging, since each has to inspect every packet in promiscuous mode to decide where it needs to go (essentially acting as a layer 2 switch).
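You can check what the bridge looks like on the host, for example (bridge and interface names are placeholders):

# list OVS bridges and the ports attached to one of them
ovs-vsctl list-br
ovs-vsctl list-ports br0
# the physical uplink will normally show promiscuous mode enabled
ip -d link show eno1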

In high-performance scenarios, such as with 10 Gbit Ethernet, it is sometimes prudent to use SR-IOV or full PCI device passthrough rather than letting the host OS switch at layer 2. Full passthrough comes with the drawback that only that one guest may use the passed-through Ethernet card (SR-IOV virtual functions relax this, since one card can be split among several guests). PCI passthrough works extremely well for network cards, and KVM / libvirt excels at this.
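With libvirt this ends up as a <hostdev> entry in the domain XML; a rough sketch (the PCI address is a placeholder you would look up with lspci):

<!-- hand a physical NIC (or an SR-IOV virtual function) straight to the guest -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>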

Macvtap can also pass traffic directly to a guest VM with almost no overhead and without SR-IOV PCI passthrough (so you don't have to dedicate hardware to a single VM). Macvtap is limited in that it cannot provide host-to-guest communication, and in VEPA mode not even guest-to-guest within the same hypervisor, because frames are sent straight out of the physical NIC instead of being switched back locally. One way to get around this is to perform "hairpinning" at the physical switch (if your switch supports it), allowing a port to talk back to itself via a sort of loopback.
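In libvirt this is a "direct" interface; a sketch (the physical device name is a placeholder):

<interface type='direct'>
  <!-- mode='bridge' lets guests on the same physical device reach each other;
       mode='vepa' sends everything to the physical switch and needs hairpinning for local traffic -->
  <source dev='eno1' mode='bridge'/>
  <model type='virtio'/>
</interface>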

For host-to-guest communication when using either of the methods above, it is common to provide an additional bridged network that is used just for that and not for the high-performance traffic. This is actually a very common configuration when using >= 10 Gbit Ethernet on VMs.
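A sketch of what that can look like in the domain XML, assuming a macvtap fast path plus a separate management bridge (device and bridge names are placeholders):

<!-- high-performance path: macvtap directly on the 10G NIC -->
<interface type='direct'>
  <source dev='ens1f0' mode='vepa'/>
  <model type='virtio'/>
</interface>
<!-- low-bandwidth path over an ordinary bridge, used for host-to-guest traffic -->
<interface type='bridge'>
  <source bridge='br-mgmt'/>
  <model type='virtio'/>
</interface>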

Spooler