
I took over a Debian 7 server with an Intel NIC where the ports are bonded together for load balancing. This is the hardware:

lspci -vvv | egrep -i 'network|ethernet'
04:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
04:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
07:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
07:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

First, what confuses me is that lspci shows four entries and the system shows eth0 - eth3 (four ports), even though the NIC only has two ports according to the specs. However, only eth2 and eth3 are actually up and running, so two ports:

ip link show

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN mode DEFAULT qlen 1000
    link/ether 00:25:90:19:5c:e4 brd ff:ff:ff:ff:ff:ff
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN mode DEFAULT qlen 1000
    link/ether 00:25:90:19:5c:e7 brd ff:ff:ff:ff:ff:ff
4: eth2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether 00:25:90:19:5c:e6 brd ff:ff:ff:ff:ff:ff
5: eth3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT qlen 1000
    link/ether 00:25:90:19:5c:e5 brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT 
    link/ether 00:25:90:19:5c:e6 brd ff:ff:ff:ff:ff:ff

The issue is that I get lower speeds than expected. When running two instances of iperf (one per port), I only get a combined speed of 942 Mbit/s, i.e. 471 Mbit/s per port. I would expect more than that, since each port can do 1 Gbit/s! How come - is the bonding not configured for maximum performance?

[  3] local xx.xxx.xxx.xxx port 60868 connected with xx.xxx.xxx.xxx port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-180.0 sec  9.87 GBytes   471 Mbits/sec
[  3] local xx.xxx.xxx.xxx port 49363 connected with xx.xxx.xxx.xxx port 5002
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-180.0 sec  9.87 GBytes   471 Mbits/sec
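
For reference, the two parallel tests were started roughly like this (a sketch with placeholder addresses, assuming classic iperf2 servers listening on ports 5001 and 5002 on the remote host, matching the output above):

iperf -s -p 5001 &                         # on the remote server
iperf -s -p 5002 &                         # on the remote server
iperf -c xx.xxx.xxx.xxx -p 5001 -t 180 &   # on this host, first stream
iperf -c xx.xxx.xxx.xxx -p 5002 -t 180 &   # on this host, second stream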

Bonding configuration in /etc/network/interfaces:

auto bond0
iface bond0 inet static
    address xx.xxx.xxx.x
    netmask 255.255.255.0
    network xx.xxx.xxx.x
    broadcast xx.xxx.xxx.xxx
    gateway xx.xxx.xxx.x
    up /sbin/ifenslave bond0 eth0 eth1 eth2 eth3
    down /sbin/ifenslave -d bond0 eth0 eth1 eth2 eth3
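
Note that this stanza never sets a bonding mode, so the TLB mode shown below is presumably coming from module options. On Debian 7 that would typically be a file under /etc/modprobe.d, roughly like this (the file name is just an example):

# /etc/modprobe.d/bonding.conf (example file name)
options bonding mode=balance-tlb miimon=100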

The configured bonding mode is:

cat /proc/net/bonding/bond0

 Bonding Mode: transmit load balancing
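
The full /proc/net/bonding/bond0 output also lists the miimon interval and the per-slave link state; the active mode can also be read via the bonding driver's standard sysfs interface:

cat /sys/class/net/bond0/bonding/mode
balance-tlb 5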

Output of ifconfig:

bond0     Link encap:Ethernet  HWaddr 00:25:90:19:5c:e6  
          inet addr:xx.xxx.xxx.9  Bcast:xx.xxx.xxx.255  Mask:255.255.255.0
          inet6 addr: fe80::225:90ff:fe19:5ce6/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:19136117104 errors:30 dropped:232491338 overruns:0 frame:15
          TX packets:19689527247 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:20530968684525 (18.6 TiB)  TX bytes:17678982525347 (16.0 TiB)

eth0      Link encap:Ethernet  HWaddr 00:25:90:19:5c:e4  
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:235903464 errors:0 dropped:0 overruns:0 frame:0
          TX packets:153535554 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:202899148983 (188.9 GiB)  TX bytes:173442571769 (161.5 GiB)
          Memory:fafe0000-fb000000 

eth1      Link encap:Ethernet  HWaddr 00:25:90:19:5c:e7  
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:3295412 errors:0 dropped:3276992 overruns:0 frame:0
          TX packets:152777329 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:213880307 (203.9 MiB)  TX bytes:172760941087 (160.8 GiB)
          Memory:faf60000-faf80000 

eth2      Link encap:Ethernet  HWaddr 00:25:90:19:5c:e6  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:18667703388 errors:30 dropped:37 overruns:0 frame:15
          TX packets:9704053069 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:20314102256898 (18.4 TiB)  TX bytes:8672061985928 (7.8 TiB)
          Memory:faee0000-faf00000 

eth3      Link encap:Ethernet  HWaddr 00:25:90:19:5c:e5  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:229214840 errors:0 dropped:229214309 overruns:0 frame:0
          TX packets:9679161295 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:13753398337 (12.8 GiB)  TX bytes:8660717026563 (7.8 TiB)
          Memory:fae60000-fae80000 

EDIT: I found the answer, thanks to the point made below. The system was running in bond mode 5 (TLB); to get double the throughput it has to run in bond mode 4 (IEEE 802.3ad dynamic link aggregation). Thanks!
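
For reference, a minimal sketch of what the 802.3ad setup could look like in /etc/network/interfaces with the ifenslave package (placeholder addresses; the switch ports also have to be configured for LACP, and only the ports that are actually cabled should be enslaved):

auto bond0
iface bond0 inet static
    address xx.xxx.xxx.x
    netmask 255.255.255.0
    gateway xx.xxx.xxx.x
    bond-slaves eth2 eth3
    bond-mode 802.3ad
    bond-miimon 100
    bond-lacp-rate fast
    bond-xmit-hash-policy layer3+4   # spreads flows across slaves; a single TCP stream still tops out at 1 Gbit/s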

azren
  • Need more information. First, why are you using TLB and not 802.3ad? What is your workload, and what are you running iperf to? – Spooler Sep 13 '17 at 08:54
  • What is the difference between TLB and 802.3ad, and how would I configure 802.3ad instead of TLB? I am very glad to post any further information if you let me know what exactly you need. I am running iperf to another server with the same hardware and bonding, except that its bonding mode is IEEE 802.3ad dynamic link aggregation. A reverse speed test with iperf (from that server to this one) yields the same performance. – azren Sep 13 '17 at 09:03

1 Answer


If you're only aware of two ports and have only cabled two ports, then you should:

  1. Figure out what's going on with the other two ports you don't currently see. Maybe they're integrated into the server motherboard, or even cabled to something you don't expect (a couple of ethtool checks for this are sketched after this list).
  2. Only bond together in your software networking configuration what you've physically connected to the same switch.

Once that's sorted out, you'll be in a much better position to gather the information you need to track down the performance issue.
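
For the first point, ethtool can help map the ethX names to physical ports (a sketch; the interface name and LED-blink duration are just examples):

ethtool eth0          # link status, speed and supported modes for this port
ethtool -p eth0 10    # blink the port's identification LED for 10 seconds
ethtool -i eth0       # driver (igb for the 82576) and PCI bus address, to match against lspci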

Spooler