Host Server
item | Details |
---|---|
OS | Ubuntu 20.04.5 |
Kernel | 5.15.0-53 |
Driver Version | 5.15.0-56-generic |
Firmware Version | 220.0.57.0/pkg 22.00.07.60 |
SRIOV | VF: 8 |
Hardware
item | Details |
---|---|
CPU | Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz |
Memory | 192GB |
Model | PowerEdge R640 |
NIC | BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller |
Private Cloud
item | Details |
---|---|
OpenStack | Yoga |
Neutron | SR-IOV |
Hypervisor | KVM |
KVM Version | 4.2.1 |
Qemu Version | 4.2.1 |
Libvirt Version | 6.0.0 |
VM Instance
item | Details |
---|---|
OS | Ubuntu 20.04.5 |
Kernel | 5.15.0-53 |
CPU | 4 vCore |
Memory | 16GB |
NIC | NetXtreme-E Ethernet Virtual Function |
Driver Version | 5.15.0-56-generic |
Kubernetes
item | Details |
---|---|
Version | v1.23.12 |
CNI | Cilium v1.11.10 |
Host Routing | eBPF |
Tunnel Mode | VXLAN |
Issue
I created multiple VMs with the SR-IOV network type on the Broadcom NIC and built a Kubernetes cluster on them.
After the cluster was set up, the metrics-server pod failed to start.
Investigating the pod failure, I found that CoreDNS queries were failing.
The CoreDNS queries failed because health checks between the Cilium agents were failing.
To find the cause, I ran tests in several environments, as follows:
https://i.stack.imgur.com/pFgCD.png
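For reference, the health check failure between the Cilium agents can be inspected roughly as follows. This is only a sketch: it assumes the agents run in the `kube-system` namespace with the default `k8s-app=cilium` label and a container named `cilium-agent`, and the function name `cilium_health_all` is just illustrative.

```shell
# Print the node-to-node health status as seen by every Cilium agent pod.
# Assumptions: namespace kube-system, label k8s-app=cilium, container cilium-agent.
cilium_health_all() {
  local ns="${1:-kube-system}"
  local pod
  for pod in $(kubectl -n "$ns" get pods -l k8s-app=cilium -o name); do
    echo "== $pod =="
    # cilium-health ships inside the agent container and reports per-node reachability
    kubectl -n "$ns" exec "$pod" -c cilium-agent -- cilium-health status
  done
}

# Usage: cilium_health_all            # default namespace
#        cilium_health_all my-ns      # hypothetical custom namespace
```

Nodes that another agent cannot reach show up as failed probes in that output, which helps narrow down which VM pairs are affected.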
Issue Details
Communication within the Kubernetes cluster fails when the worker nodes are VMs running on different host servers.
Cluster Network Communication Test
Case #1
- Configuration: Broadcom NIC
- Worker Nodes: VM (VF)
- Test Result:
  - With the Cilium CNI, transmitted packets were observed at the VM and host server level, but they were not detected at the leaf switch.
Case #2
- Configuration: Broadcom NIC, Intel NIC
- Worker Nodes: VM (VF)
- Test Result:
  - Packets transmitted from the Intel NIC were successfully received by the Broadcom NIC. However, packets transmitted from the Broadcom NIC were not detected at the leaf switch.
```
# Intel NIC ( 10.172.102.93 )
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
16:11:50.113760 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 87, length 64
16:11:51.137755 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 88, length 64
16:11:52.161775 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 89, length 64
16:11:53.185763 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 90, length 64
16:11:54.209795 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 91, length 64
16:11:55.233780 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 92, length 64
16:11:56.257743 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 93, length 64

# Broadcom NIC ( 10.172.102.183 )
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens4, link-type EN10MB (Ethernet), capture size 262144 bytes
16:10:40.161741 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 20, length 64
16:10:40.161926 IP 10.172.102.183.50506 > 10.172.102.93.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.6.221 > 10.244.5.126: ICMP echo reply, id 1, seq 20, length 64
16:10:41.185717 IP 10.172.102.93.60117 > 10.172.102.183.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.5.126 > 10.244.6.221: ICMP echo request, id 1, seq 21, length 64
16:10:41.185895 IP 10.172.102.183.50506 > 10.172.102.93.8472: OTV, flags [I] (0x08), overlay 0, instance 6
    IP 10.244.6.221 > 10.244.5.126: ICMP echo reply, id 1, seq 21, length 64
```
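In the captures above, the Broadcom VF sees both the echo requests and its own replies, while the Intel side only ever sees requests. To make that comparison less manual, the inner ICMP packets in saved tcpdump text output can be counted. A minimal sketch, assuming the capture was written to a text file; the helper name `count_inner_icmp` and the file name `capture.txt` are hypothetical.

```shell
# Count inner ICMP echo requests and replies in tcpdump text output read from stdin.
# Example capture of the VXLAN traffic (UDP port 8472) on the VF, limited to 20 packets:
#   tcpdump -ni ens4 -c 20 udp port 8472 > capture.txt
count_inner_icmp() {
  awk '
    /ICMP echo request/ { req++ }
    /ICMP echo reply/   { rep++ }
    END { printf "requests=%d replies=%d\n", req, rep }
  '
}

# Usage: count_inner_icmp < capture.txt
```

Running it on captures from both ends shows at a glance on which side the replies disappear.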
Case #3
- Configuration: Broadcom NIC
- Worker Nodes: Host (PF)
- Test Result:
  - Communication was normal. No issues were detected.
Cluster Network Communication Test - Configuration Changes
Disabling XDP
Communication still failed, as in Case #1.
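One way to double-check that no XDP program is still attached to the VF interface is to look at the `ip link` output. A minimal sketch, assuming the interface is `ens4` as in the capture above; the helper name `xdp_attached` is hypothetical.

```shell
# Report whether an XDP program is attached to the given interface.
# "ip link show" includes an xdp marker (for example "prog/xdp") when one is attached.
xdp_attached() {
  local dev="${1:-ens4}"
  if ip link show dev "$dev" | grep -q 'xdp'; then
    echo "$dev: XDP program attached"
  else
    echo "$dev: no XDP program attached"
  fi
}

# Usage: xdp_attached ens4
```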
NIC Setting Changes
Settings:
Settings | Command |
---|---|
generic-segmentation-offload | ethtool -K ens4 gso off |
tx-udp_tnl-segmentation | ethtool -K ens4 tx-udp_tnl-segmentation off |
tx-udp_tnl-csum-segmentation | ethtool -K ens4 tx-udp_tnl-csum-segmentation off |
Result: Communication still failed, as in Case #1.
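The three settings from the table can be applied and verified in one step. A sketch, assuming the VF interface is `ens4` and the commands run as root inside the VM; the function name `disable_tunnel_offloads` is made up for illustration.

```shell
# Turn off the three offloads listed in the table, then print their current state.
disable_tunnel_offloads() {
  local dev="${1:-ens4}"
  local feature
  for feature in gso tx-udp_tnl-segmentation tx-udp_tnl-csum-segmentation; do
    ethtool -K "$dev" "$feature" off
  done
  # Lowercase -k lists the current feature states; keep only the relevant lines.
  ethtool -k "$dev" | grep -E 'generic-segmentation-offload|tx-udp_tnl'
}

# Usage: disable_tunnel_offloads ens4
```

Checking the `ethtool -k` output afterwards matters because a feature the driver reports as `[fixed]` cannot be changed.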
VM Broadcom NIC Driver Version Change
- Previous: 5.15.0-58-generic
- Updated: 1.10.2-223.0.183.0
- Result: Communication still failed, as in Case #1.
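To confirm which driver and firmware versions the VF interface actually reports after such a change, `ethtool -i` can be queried. A small sketch, assuming the interface is `ens4`; the helper name `nic_driver_info` is hypothetical.

```shell
# Show the driver name, driver version, and firmware version for an interface.
nic_driver_info() {
  local dev="${1:-ens4}"
  ethtool -i "$dev" | grep -E '^(driver|version|firmware-version):'
}

# Usage: nic_driver_info ens4
```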
Does the Broadcom NIC Virtual Function driver not support running Kubernetes with the Cilium CNI inside VMs that use SR-IOV? Or is additional configuration required?