
I am trying to troubleshoot an issue with our servers on a customer network. We do not support the network infrastructure.

For example, I have 2 nodes (WS2019) in a Failover Cluster. Each node has 2 NICs, connected to swt 1 and swt 2 for fault tolerance. The NICs of each node are joined in a NIC team in switch-independent mode with Dynamic load balancing. The VM uses an external NIC to connect to the network.

While migrating a VM (WS2019) to another node, I receive an error on the switch: sw_matm-4-macflap_notif, with the MAC address of the VM that was live-migrated to the other node and the port where the MAC address flapping occurs.

I cannot find the cause of it or work out how to troubleshoot it.

Is there any information that will help me to resolve the issue I encountered?

Andro Leo
  • 1
    It seems like this would be an expected and temporary error. Is it creating a long term issue? If not, why do you want/need to troubleshoot it? – joeqwerty Jan 25 '21 at 18:16
  • From the feedback I received about this error, it seems to be a big issue for the network team. The error appears more than once; it shows up on the switch a lot. – Andro Leo Jan 25 '21 at 19:33
  • 1
    Do you use Hyper-V? By default Hyper-V uses dynamic MAC addressing on VMs. Perhaps it has been set to static for the migrated VM - which is a reasonable choice, especially on Linux guests. You could try migrating a dynamic MAC VM to check if the same error occurs. – Krackout Jan 26 '21 at 09:57

1 Answer


I feel this behaviour is normal and expected for switch-independent teaming mode.

According to Microsoft docs:

With Switch Independent mode, the switch or switches to which the NIC Team members are connected are unaware of the presence of the NIC team and do not determine how to distribute network traffic to NIC Team members - instead, the NIC Team distributes inbound network traffic across the NIC Team members.

How does it perform this distribution? It sends the target VM's egress (outgoing) packets via the corresponding team member, in the hope that the switches see the VM's MAC address and learn that it is on the port that team member is connected to. Reply packets directed to that MAC address will then reach the VM through that port. However, Hyper-V also does some form of load balancing:

When you use Switch Independent mode with Dynamic distribution, the network traffic load is distributed based on the TCP Ports address hash as modified by the Dynamic load balancing algorithm. The Dynamic load balancing algorithm redistributes flows to optimize team member bandwidth utilization so that individual flow transmissions can move from one active team member to another.

I.e. sometimes that MAC may disappear from that port and appear on the port where another team member is connected, to free the former port for other traffic (so the VM is "balanced out" of the overloaded port onto a less loaded one). The network indeed sees this as "MAC address flapping", because the MAC addresses of VMs move back and forth between the set of ports where the NIC team members of their host node are connected.
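To make the idea a bit more concrete, here is a toy model of hashing flows onto team members. It is only a sketch: the hash (CRC32 over the TCP port pair), the function name pick_team_member and the member labels are my assumptions for illustration, not the actual Windows algorithm, which additionally moves flows between members over time.

```python
import zlib
from typing import Sequence


def pick_team_member(src_port: int, dst_port: int, members: Sequence[str]) -> str:
    """Map a flow (identified here only by its TCP ports) to one team member.

    Hypothetical stand-in for an address-hash distribution; the real
    Dynamic mode also rebalances flows, which this does not model.
    """
    key = f"{src_port}:{dst_port}".encode()
    return members[zlib.crc32(key) % len(members)]


# Hypothetical team of two NICs, one cabled to each switch.
members = ["NIC1 (to swt 1)", "NIC2 (to swt 2)"]

# Several flows from the same VM, i.e. all with the same source MAC.
flows = [(49152 + i, 443) for i in range(8)]
for src, dst in flows:
    print(src, dst, "->", pick_team_member(src, dst, members))
```

In this model, whenever flows from one VM land on different team members, frames carrying the same source MAC leave through different physical NICs, and that is what the switches end up observing.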

During migration, there may be a short period of MAC "flapping" between nodes, after which things must settle. Once the VM is finally running on the new node, its MAC begins to float between the ports where the new node's NICs are connected.
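From the switch's side, both cases look the same: a MAC it has already learned on one port shows up as a source address on another. A minimal sketch of that learning logic follows. The class MacTable, the MAC address and the port names are hypothetical, and a real switch also applies ageing and rate thresholds before it logs a flap.

```python
from collections import defaultdict


class MacTable:
    """Toy MAC learning table that counts how often a MAC changes port."""

    def __init__(self):
        self.port_of = {}              # MAC -> port it was last seen on
        self.moves = defaultdict(int)  # MAC -> number of port changes

    def learn(self, mac, port):
        """Record a frame's source MAC; return True if the MAC moved."""
        previous = self.port_of.get(mac)
        self.port_of[mac] = port
        if previous is not None and previous != port:
            self.moves[mac] += 1
            return True
        return False


table = MacTable()
vm_mac = "00:15:5d:00:00:01"   # hypothetical VM MAC
host_port = "Gi1/0/1"          # hypothetical port facing the node's NIC
uplink = "Po1"                 # hypothetical link towards the other switch

# Frames from the VM arriving alternately via the local NIC and via the
# other switch, as happens when traffic shifts between team members.
for port in [host_port, uplink, host_port, host_port, uplink]:
    if table.learn(vm_mac, port):
        print(f"MAC {vm_mac} moved to {port}")
```

A live migration is just one more move in this table, after which the MAC starts alternating between the ports that lead to the new node.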

This movement between team members could be stopped by disabling load balancing. The move to another set of ports on migration is inevitable.

Network engineers who administer networks where clusters are present must be aware of this behaviour. If it is undesirable, the proper way to address it is to use clustering-aware stackable switches and correctly configured switch-dependent teaming modes.

Nikita Kipriyanov