
Host Server

Item              Details
OS                Ubuntu 20.04.5
Kernel            5.15.0-53
Driver Version    5.15.0-56-generic
Firmware Version  220.0.57.0/pkg 22.00.07.60
SR-IOV VFs        8

Hardware

Item    Details
CPU     Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz
Memory  192GB
Model   PowerEdge R640
NIC     BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller

Private Cloud

Item             Details
OpenStack        Yoga
Neutron          SR-IOV
Hypervisor       KVM
KVM Version      4.2.1
QEMU Version     4.2.1
Libvirt Version  6.0.0

VM Instance

Item            Details
OS              Ubuntu 20.04.5
Kernel          5.15.0-53
CPU             4 vCore
Memory          16GB
NIC             NetXtreme-E Ethernet Virtual Function
Driver Version  5.15.0-56-generic

Kubernetes

Item          Details
Version       v1.23.12
CNI           Cilium v1.11.10
Host Routing  eBPF
Tunnel Mode   VXLAN

Issue

I have created multiple VMs using the SR-IOV network type through the Broadcom NIC and configured a cluster.

After the cluster configuration was completed, the metrics server pod failed to start.

While investigating the pod failure, I found that CoreDNS queries were failing.

The CoreDNS queries failed because the health checks between Cilium agents were failing.

In order to find the cause, I conducted tests in different environments as follows:

https://i.stack.imgur.com/pFgCD.png

Issue Details

Nodes cannot communicate within the Kubernetes cluster when the worker nodes are VMs running on different host servers.

Cluster Network Communication Test

Case #1

  • Configuration: Broadcom NIC
  • Worker Nodes: VM (VF)
  • Test Result: Using Cilium CNI, transmitted packets were observed on both the VM and the host server, but they were not detected at the leaf switch.

Case #2

  • Configuration: Broadcom NIC, Intel NIC
  • Worker Nodes: VM (VF)
  • Test Result: Packets transmitted from the Intel NIC were successfully received by the Broadcom NIC. However, packets transmitted from the Broadcom NIC were not detected at the leaf switch.
# Intel NIC ( 10.172.102.93 )
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
16:11:50.113760 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 87, length 64
16:11:51.137755 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 88, length 64
16:11:52.161775 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 89, length 64
16:11:53.185763 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 90, length 64
16:11:54.209795 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 91, length 64
16:11:55.233780 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 92, length 64
16:11:56.257743 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 93, length 64
# Broadcom NIC ( 10.172.102.183 )
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens4, link-type EN10MB (Ethernet), capture size 262144 bytes
16:10:40.161741 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 20, length 64
16:10:40.161926 IP 10.172.102.183.50506 > 10.172.102.93.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.6.221 > 10.244.5.126: ICMP echo reply, id 1, seq 20, length 64
16:10:41.185717 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 21, length 64
16:10:41.185895 IP 10.172.102.183.50506 > 10.172.102.93.8472: OTV, flags [I] (0x08), overlay 0, instance 6
IP 10.244.6.221 > 10.244.5.126: ICMP echo reply, id 1, seq 21, length 64
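
To compare the two captures above side by side, it can help to count the outer VXLAN packets (UDP port 8472) per source address. The following is only a minimal sketch and was not part of the original tests: the function name count_vxlan_by_src and the file name capture.pcap are hypothetical, and it assumes tcpdump's default one-line text output as shown above.

```shell
# Count outer VXLAN packets (destination UDP port 8472) per source address.
# Reads tcpdump text output on stdin. Inner-packet lines, which carry no
# timestamp and therefore start with "IP", are skipped.
count_vxlan_by_src() {
  awk '$2 == "IP" && $5 ~ /\.8472:$/ {
         src = $3
         sub(/\.[0-9]+$/, "", src)   # strip the source port
         count[src]++
       }
       END { for (s in count) print s, count[s] }' | sort
}

# Possible usage with a saved capture (capture.pcap is a hypothetical file):
#   tcpdump -nr capture.pcap udp port 8472 | count_vxlan_by_src
```

If a capture on the Intel side lists only 10.172.102.93 as a source while the Broadcom side lists both addresses, that matches the one-way loss described in Case #2.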

Case #3

  • Configuration: Broadcom NIC
  • Worker Nodes: Host (PF)
  • Test Result: Communication was normal. No issues were detected.

Cluster Network Communication Test - Configuration Changes

Disabling XDP

Communication remained impossible, similar to Case #1.

NIC Setting Changes

Settings:

Setting                       Command
generic-segmentation-offload  ethtool -K ens4 gso off
tx-udp_tnl-segmentation       ethtool -K ens4 tx-udp_tnl-segmentation off
tx-udp_tnl-csum-segmentation  ethtool -K ens4 tx-udp_tnl-csum-segmentation off

Result: Communication remained impossible, similar to Case #1.
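
To rule out the possibility that one of the settings above did not take effect, the current feature state can be read back with ethtool -k (lowercase). This is a minimal sketch under the assumption that the interface is still named ens4; the helper name show_tunnel_offloads is hypothetical.

```shell
# Print the current state of the three offload features changed above.
# Reads `ethtool -k` output on stdin and keeps only the matching feature lines.
show_tunnel_offloads() {
  grep -E '^[[:space:]]*(generic-segmentation-offload|tx-udp_tnl-segmentation|tx-udp_tnl-csum-segmentation):'
}

# Possible usage on the VM:
#   ethtool -k ens4 | show_tunnel_offloads
```

A feature that still reports "on", or that is marked "[fixed]", was not changed by the corresponding ethtool -K command.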

VM Broadcom NIC Driver Version Change

  • Previous: 5.15.0-58-generic
  • Updated: 1.10.2-223.0.183.0
  • Result: Communication remained impossible, similar to Case #1.

Does the Broadcom NIC virtual function driver not support running Kubernetes with the Cilium CNI inside VMs that use SR-IOV? Or is additional configuration required?

kyleyoon
  • I'm not seeing why you think this would be related to Cilium. It seems you are saying communication between the two VMs is failing. Does it work if you remove Cilium? – pchaigno Aug 23 '23 at 22:17
