I have set up an LACP bond on 2 x 1 Gbps connections on an HP server running Debian 8.x. I have previously done this configuration on CentOS 7.x with no issues at all.

The issue I am facing is that eth0 goes into a churned state about a minute after the OS boots, once the "monitoring" stage has completed.

Actor Churn State: churned
Partner Churn State: churned

I have read around online and can't find much about what can cause this. I have had the DC check the switch configuration, and it is identical to that of a working CentOS setup.

I have attached the bonding status and the network configuration below. The connection works, but it only uses eth1, which removes the benefits of a bond.
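One way to confirm which slaves actually joined the active aggregator is to scan the bonding status text per slave. This is a minimal sketch, assuming the field labels match the /proc/net/bonding/bond0 output in this post; `slaves_outside_active_aggregator` and the trimmed `SAMPLE` text are hypothetical names used only for illustration.

```python
# Trimmed copy of the relevant lines from /proc/net/bonding/bond0 in this post.
SAMPLE = """\
Active Aggregator Info:
    Aggregator ID: 2
    Number of ports: 1

Slave Interface: eth0
Aggregator ID: 1
Actor Churn State: churned

Slave Interface: eth1
Aggregator ID: 2
Actor Churn State: none
"""


def slaves_outside_active_aggregator(text):
    """Return the slaves whose Aggregator ID differs from the active one."""
    active_id = None      # ID listed under "Active Aggregator Info"
    current = None        # slave section currently being read
    in_active = False     # True while inside the "Active Aggregator Info" block
    slave_ids = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Active Aggregator Info"):
            in_active = True
        elif line.startswith("Slave Interface:"):
            current = line.split(":", 1)[1].strip()
            in_active = False
        elif line.startswith("Aggregator ID:"):
            agg = int(line.split(":", 1)[1])
            if in_active and current is None:
                active_id = agg
            elif current is not None:
                slave_ids[current] = agg
    return [name for name, agg in slave_ids.items() if agg != active_id]


print(slaves_outside_active_aggregator(SAMPLE))  # prints ['eth0']
```

On a live system the same function could be fed the contents of /proc/net/bonding/bond0 instead of the sample string.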

cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 70:10:6f:51:88:8c
Active Aggregator Info:
    Aggregator ID: 2
    Number of ports: 1
    Actor Key: 9
    Partner Key: 14
    Partner Mac Address: 54:4b:8c:c9:51:c0

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 70:10:6f:51:88:8c
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: churned
Partner Churn State: churned
Actor Churned Count: 1
Partner Churned Count: 1
details actor lacp pdu:
system priority: 65535
system mac address: 70:10:6f:51:88:8c
port key: 9
port priority: 255
port number: 1
port state: 71
details partner lacp pdu:
system priority: 65535
system mac address: 00:00:00:00:00:00
oper key: 1
port priority: 255
port number: 1
port state: 1

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 70:10:6f:51:88:8d
Slave queue ID: 0
Aggregator ID: 2
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
system priority: 65535
system mac address: 70:10:6f:51:88:8c
port key: 9
port priority: 255
port number: 2
port state: 63
details partner lacp pdu:
system priority: 127
system mac address: 54:4b:8c:c9:51:c0
oper key: 14
port priority: 127
port number: 29
port state: 63
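The `port state` values in the output above are bit fields defined by IEEE 802.3ad, so decoding them shows how eth0 differs from eth1. A minimal sketch follows; `decode_port_state` is a hypothetical helper name, and the bit names follow the standard's actor/partner state flags.

```python
# LACP port state flags, least significant bit first (IEEE 802.3ad).
LACP_STATE_BITS = [
    "LACP_Activity",
    "LACP_Timeout",
    "Aggregation",
    "Synchronization",
    "Collecting",
    "Distributing",
    "Defaulted",
    "Expired",
]


def decode_port_state(value):
    """Return the names of the flags that are set in a port state value."""
    return [name for bit, name in enumerate(LACP_STATE_BITS) if value & (1 << bit)]


# Values taken from the bonding output in this post.
for label, state in [
    ("eth0 actor", 71),
    ("eth0 partner", 1),
    ("eth1 actor", 63),
    ("eth1 partner", 63),
]:
    print(label, state, decode_port_state(state))
```

Decoded this way, eth0's actor state of 71 has Defaulted set and lacks Synchronization, Collecting and Distributing. Defaulted means the port is using default partner information rather than information learned from received LACPDUs, which is consistent with the all-zero partner MAC address shown for eth0. eth1's state of 63 has all six of the lower flags set on both sides.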

Network Interfaces

auto eth0
iface eth0 inet manual
bond-master bond0

auto eth1
iface eth1 inet manual
bond-master bond0

auto bond0
iface bond0 inet manual
    bond_miimon 100
    bond_mode 802.3ad
    bond-downdelay 200
    bond-updelay 200
    bond-slaves none

auto vlan520
iface vlan520 inet static
    address  62.xxx.xxx.40
    netmask  255.255.255.0
    gateway  62.xxxx.xxxx.1
    vlan-raw-device bond0

auto vlan4001
iface vlan4001 inet static
    address  172.16.1.1
    netmask  255.255.255.0
    vlan-raw-device bond0

/etc/modprobe.d/bonding.conf

alias bond0 bonding
    options bonding mode=4 miimon=100 lacp_rate=1

Any help will be appreciated.

Thanks, Ash

1 Answer

Please refer to the following article: https://access.redhat.com/solutions/4122011

The short answer is that it's related to the last kernel update. The article suspects that the following commit is related to the LACP issue: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=ea53abfab960909d622ca37bcfb8e1c5378d21cc

Until a fix becomes available, it makes sense to boot into an older kernel. The issue started happening with the following version on Red Hat-based OSs:

kernel-3.10.0-957.1.3.el7

I will try to keep this post up to date, as it looks like the last kernel update affected quite a few users.

Additional Reference:

https://patchwork.ozlabs.org/patch/437496/

Dmitriy Kupch
  • This is a restricted-access article; it can't help anybody. – wazoox Jun 25 '19 at 16:57
  • All you need to know is written in the second line of the page I refer to: "Solution In Progress". Currently there is no solution other than reverting to the previous kernel. Once a solution becomes available, you will be able to get a patch for your OS. There are more articles available for free related to the same issue; I just picked the official one. If you don't need a Red Hat subscription, you are probably not using one. I don't think the word "anybody" is accurate in your statement. – Dmitriy Kupch Jun 26 '19 at 17:26
  • "Solution in progress" isn't a clear explanation. Your answer doesn't qualify as a proper one as per this site's guidelines. Please edit it and explain clearly what it's about in the text of your answer, so that the answer isn't only available on some external website with uncertain accessibility. And please be polite with site users, also as per the Stack Exchange guidelines. – wazoox Jun 26 '19 at 21:40