
A while ago we discovered an issue with most of the NICs in our data center: merely touching a cable makes the NIC port go offline for a couple of seconds. All the NICs are Broadcom or Intel, bought from Dell either with the servers or separately.

Over the last couple of months we ran some tests with new servers. For the first few months the NICs work without any issues, but after that the same problem shows up there as well. We have already tried replacing the cables (as recommended by our data center provider and by other admins we know), but no luck: the issue persists.

Seriously, I cannot imagine that all the NICs from popular brands become "faulty" after a couple of months. On the other hand, trying around 5-6 different cables from good vendors gives the same result. We even bought new Broadcom NICs and installed them in one of the affected servers: brought it back online, touched the cable -> link gone for ~5 seconds.
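To put a number on these drops, a minimal Python sketch along these lines could time each loss of link while someone touches the cable. It assumes Linux hosts (the OS is not stated here) and reads the standard sysfs file `/sys/class/net/<iface>/carrier`; the interface name `eth0` and all function names are hypothetical.

```python
import time
from pathlib import Path


def outage_durations(samples):
    """Return the length in seconds of each link-down interval.

    samples: iterable of (timestamp_seconds, link_up) pairs in time order.
    An interval that is still down at the last sample is not reported.
    """
    durations = []
    down_since = None
    for ts, link_up in samples:
        if not link_up and down_since is None:
            down_since = ts  # link just dropped
        elif link_up and down_since is not None:
            durations.append(ts - down_since)  # link came back
            down_since = None
    return durations


def read_carrier(iface):
    """Read link state from Linux sysfs; True means carrier is up."""
    path = Path("/sys/class/net") / iface / "carrier"
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # Assumption: treat an unreadable carrier file (for example while
        # the interface is administratively down) as "no link".
        return False


def watch(iface, seconds=60.0, interval=0.1):
    """Poll the interface for `seconds` and return observed outage durations."""
    samples = []
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), read_carrier(iface)))
        time.sleep(interval)
    return outage_durations(samples)
```

Calling `watch("eth0")` during a touch test would return one duration per drop, at the resolution of the polling interval. Comparing those values across NIC models and cables would show whether the ~5 seconds is constant, which might hint at link renegotiation rather than a random contact problem.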

Has anyone heard of this issue before, or been affected by it and found a solution?

mtbftw
  • Copper or optical? Does a hardware cable tester show it in spec, or has testing been limited to NICs in servers? – John Mahowald Aug 20 '21 at 13:19
  • Copper only. We also run some optical links, but there are no issues there. In a cable tester everything is fine: we can pull, push, and touch the cables and they stay green. We also reported it to Dell, but they had not heard of it before. Neither had our DC provider. – mtbftw Aug 21 '21 at 11:12
  • What is the temperature on the hot side? Does it vary significantly over time? – Michael Hampton Aug 21 '21 at 18:10
  • The NICs are "pretty hot" under load, but I would say that is usual for 10G NICs. I can't tell you the exact temperature, but I would guess around 60°C (again, under load). It doesn't vary; it stays constant. The iDRACs do not report any temperature anomalies, and we also have temperature sensors installed in the racks: the usual rack temperature is about 21-22°C. Heat was our first idea: the cables become unstable because of it. We tested with other cables that had not been installed before, and the issue was there instantly (without any influence from the heat of the NICs). – mtbftw Aug 23 '21 at 15:15

0 Answers