(the same question was asked on http://unix.stackexchange.com)

We are having a problem with our server configuration. There are 2 servers, each with 2 NICs in a bond. Each server is connected to 2 Cisco switches (one connection from each NIC to a separate switch). The switch ports are configured for VLAN 1111. There is also an interconnection between the switches, and this VLAN is carried over it.

The following error appears on switch2 (Gi0/25 is where the server is connected, Gi0/30 is the interconnection):

*Jun  1 16:18:23.182: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/25 and port Gi0/30 
*Jun  1 16:18:45.093: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/30 and port Gi0/25 
*Jun  1 16:18:56.031: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/25 and port Gi0/30 
*Jun  1 16:19:15.141: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/25 and port Gi0/30 
*Jun  1 16:19:23.479: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/30 and port Gi0/25 
*Jun  1 16:19:45.616: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/30 and port Gi0/25 
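
To see how often a given MAC address bounces between the same two ports, the notifications can be tallied with a short script. This is only a sketch: the regular expression is an assumption based on the six lines quoted above, and `count_flaps` is a hypothetical helper name, not part of any Cisco tooling.

```python
import re
from collections import Counter

# Pattern derived from the MACFLAP_NOTIF lines quoted above (an assumption,
# other IOS versions may word the message differently).
FLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>[0-9a-f.]+) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<a>\S+) and port (?P<b>\S+)"
)

def count_flaps(log_text):
    """Count flap notifications per (mac, vlan, unordered port pair)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = FLAP_RE.search(line)
        if m:
            # Sort the two ports so "A and B" and "B and A" count together.
            ports = tuple(sorted((m["a"], m["b"])))
            counts[(m["mac"], int(m["vlan"]), ports)] += 1
    return counts

sample = """\
*Jun  1 16:18:23.182: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/25 and port Gi0/30
*Jun  1 16:18:45.093: %SW_MATM-4-MACFLAP_NOTIF: Host 1cc1.de7a.04b6 in vlan 1111 is flapping between port Gi0/30 and port Gi0/25
"""

for key, n in count_flaps(sample).items():
    print(key, n)
```

A single MAC flapping between a server-facing port and the inter-switch link suggests the same source address is arriving over two different paths.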

Checking the MAC address tables shows that both switches have learned the MAC address 1cc1.de7a.04b6:

NLS-PDC-SW2>show mac address-table vlan 1111 
          Mac Address Table
-------------------------------------------
Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
1111    1cc1.de7a.046a    DYNAMIC     Gi0/26
1111    1cc1.de7a.04b6    DYNAMIC     Gi0/25
Total Mac Addresses for this criterion: 23

NLS-PDC-SW1>show mac address-table vlan 1111 
          Mac Address Table
-------------------------------------------
Vlan    Mac Address       Type        Ports
----    -----------       --------    -----
1111    1cc1.de7a.04b6    DYNAMIC     Gi0/25
Total Mac Addresses for this criterion: 24
NLS-PDC-SW1>

Checking the modprobe file on both servers, I found the following on server2 (which has 1cc1.de7a.04b6):

alias bond0 bonding
options bond0 miimon=100

and on server1 (which has 1cc1.de7a.046a):

alias bond0 bonding
options bond0 miimon=100 mode=1
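
The two option lines differ only in `mode`. When `mode` is omitted, the bonding driver falls back to mode 0 (balance-rr, i.e. round-robin). A minimal sketch of that resolution, assuming option lines of the form shown above; `effective_mode` is a hypothetical helper for illustration only:

```python
# Numeric bonding modes and their names, as listed in the kernel's bonding.txt.
MODE_NAMES = {
    0: "balance-rr", 1: "active-backup", 2: "balance-xor", 3: "broadcast",
    4: "802.3ad", 5: "balance-tlb", 6: "balance-alb",
}

def effective_mode(options_line):
    """Return the mode name implied by an 'options bondX ...' line."""
    tokens = options_line.split()[2:]  # skip 'options' and the bond name
    opts = dict(tok.split("=", 1) for tok in tokens if "=" in tok)
    raw = opts.get("mode", "0")  # driver default is mode 0 (balance-rr)
    return MODE_NAMES[int(raw)] if raw.isdigit() else raw

print(effective_mode("options bond0 miimon=100"))         # server2's line
print(effective_mode("options bond0 miimon=100 mode=1"))  # server1's line
```

So server2 ends up in round-robin while server1 runs active-backup.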

I'm really confused about which configuration is needed. Can you please suggest one?

EDIT

[admin@servera ~]$ cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 1c:c1:de:7a:04:6a

Slave Interface: eth3
MII Status: up
Link Failure Count: 1
Permanent HW addr: 98:4b:e1:0a:cb:20


[admin@serverb ~]$ cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.4.0 (October 7, 2008)

Bonding Mode: load balancing (round-robin)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 1c:c1:de:7a:04:b6

Slave Interface: eth3
MII Status: up
Link Failure Count: 1
Permanent HW addr: 98:4b:e1:01:49:ba
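
For comparing the two hosts, the relevant fields can be pulled out of this output programmatically. The sketch below assumes the `key: value` layout shown above; `parse_bond_status` and `to_cisco` are hypothetical helper names. `to_cisco` converts a Linux-style MAC address into the dotted form the switches print, so it can be matched against the MAC address table.

```python
def parse_bond_status(text):
    """Extract the bonding mode and each slave's permanent MAC address."""
    info = {"mode": None, "slaves": {}}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without a 'key: value' shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Slave Interface":
            current = value
        elif key == "Permanent HW addr" and current:
            info["slaves"][current] = value
    return info

def to_cisco(mac):
    """Convert aa:bb:cc:dd:ee:ff to the aabb.ccdd.eeff form used by IOS."""
    digits = mac.replace(":", "").lower()
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))

# Excerpt of the serverb output above.
sample = """\
Bonding Mode: load balancing (round-robin)
MII Status: up

Slave Interface: eth1
MII Status: up
Permanent HW addr: 1c:c1:de:7a:04:b6

Slave Interface: eth3
MII Status: up
Permanent HW addr: 98:4b:e1:01:49:ba
"""

status = parse_bond_status(sample)
print(status["mode"])
print({nic: to_cisco(mac) for nic, mac in status["slaves"].items()})
```

The eth1 address on serverb converts to 1cc1.de7a.04b6, the host named in the flap messages, and serverb is the machine running in round-robin mode.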
user1977050
  • Could you show the output of # cat /proc/net/bonding/bond0 from both servers? – ALex_hha Jul 31 '13 at 13:04
  • **server A:** DEVICE=bond0 BOOTPROTO=none ONBOOT=yes NETMASK=255.255.255.240 IPADDR=10.201.39.152 USERCTL=no GATEWAY=10.201.39.145 TYPE=BOND **server B:** DEVICE=bond0 BOOTPROTO=none ONBOOT=yes NETMASK=255.255.255.240 IPADDR=10.201.39.150 USERCTL=no GATEWAY=10.201.39.145 TYPE=BOND – user1977050 Jul 31 '13 at 13:12
  • That's not what I asked for ;) Please add this information, properly formatted, to the question body. – ALex_hha Jul 31 '13 at 13:23

2 Answers

1

The ports in the round-robin (mode 0) bond need to be in an EtherChannel.

Read the bonding documentation, Chapter 5 Switch Configuration:

https://www.kernel.org/doc/Documentation/networking/bonding.txt

Also, modprobe is not the correct place to configure bonding options. You should use BONDING_OPTS="miimon=100 mode=X" in /etc/sysconfig/network-scripts/ifcfg-bondX instead.
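
As an illustration of moving the options over, the existing modprobe line can be translated mechanically into the `BONDING_OPTS` form. This is a rough sketch, not a migration tool: `to_bonding_opts` is a hypothetical helper, and it assumes the line has the `options <bond> key=value ...` shape shown in the question.

```python
def to_bonding_opts(options_line):
    """Turn 'options bond0 k=v ...' into a BONDING_OPTS= line for ifcfg-bondX."""
    parts = options_line.split()
    if len(parts) < 3 or parts[0] != "options":
        raise ValueError("expected 'options <bond> key=value ...'")
    # Everything after the bond name is carried over unchanged.
    return 'BONDING_OPTS="{}"'.format(" ".join(parts[2:]))

print(to_bonding_opts("options bond0 miimon=100 mode=1"))
```

The resulting line goes into the ifcfg-bondX file for that bond, and the corresponding `options` line is then dropped from the modprobe configuration.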

suprjami

I think the root of the issue is that you are using round-robin mode on one of your servers. Try changing round-robin (mode=0) to active-backup (mode=1).

ALex_hha
  • After the change in the modprobe file, should one restart the network? – user1977050 Aug 01 '13 at 06:28
  • Yes, you should. If you can restart the whole server, that would be better. – ALex_hha Aug 01 '13 at 07:19
  • Whilst this is a workaround, the root cause of the problem is incorrect switch configuration, as per my answer. – suprjami Nov 17 '14 at 21:51
  • Active-Backup doesn't need switch configuration. Since only one port is active at a time, the switches don't have to be aware of the bond via EtherChannel. The correct solution is either (Round Robin + EtherChannel) OR (Active-Backup + No EtherChannel). Do be sure to remove your downvote. – Christopher Karel Nov 17 '14 at 21:59
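
The pairing rule from the last comment can be written down as a small lookup. The table follows the switch configuration section of the kernel's bonding.txt as I understand it. The `check` function and its "should use ungrouped ports" verdict are a simplification for illustration, not something the documentation states in those words.

```python
# What each bonding mode expects from the switch side (None = nothing special).
SWITCH_REQUIREMENT = {
    "balance-rr": "static port grouping (EtherChannel)",
    "balance-xor": "static port grouping (EtherChannel)",
    "broadcast": "static port grouping (EtherChannel)",
    "802.3ad": "802.3ad (LACP) aggregation",
    "active-backup": None,
    "balance-tlb": None,
    "balance-alb": None,
}

def check(mode, switch_ports_grouped):
    """Report whether a bonding mode and the switch port setup agree."""
    need = SWITCH_REQUIREMENT[mode]
    if need and not switch_ports_grouped:
        return f"mismatch: {mode} needs {need} on the switch side"
    if not need and switch_ports_grouped:
        return f"mismatch: {mode} should use ungrouped switch ports"
    return "consistent"

print(check("balance-rr", False))     # serverb as described in the question
print(check("active-backup", False))  # servera as described in the question
```

With the setup from the question, only the round-robin host is flagged, which matches the single flapping MAC address in the switch log.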