
We are trying to set up a 40 Gbit/s connection between two servers and are seeing weird CPU behaviour when running iperf. The link also only reaches around 10 Gbit/s of the possible 40.

Server specs:

  • AMD EPYC 7413
  • 8x 16384 MB Multi-Bit ECC 3200 MHz memory
  • Supermicro H12SSL-CT
  • Intel XL710 40GbE
  • Ubuntu 20.04.3 LTS 5.4.0-84-gene

The servers are connected directly to each other via fibre; no switches.

Example:

host1# iperf -s
host2# iperf -c host1 -i 1 -t 120
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 1.39 GBytes 12.0 Gbits/sec
[ 3] 1.0- 2.0 sec 1.00 GBytes 8.61 Gbits/sec
[ 3] 2.0- 3.0 sec 1.03 GBytes 8.88 Gbits/sec
[ 3] 3.0- 4.0 sec 1.04 GBytes 8.92 Gbits/sec
[ 3] 4.0- 5.0 sec 1021 MBytes 8.56 Gbits/sec
[ 3] 5.0- 6.0 sec 1.05 GBytes 9.01 Gbits/sec
[ 3] 6.0- 7.0 sec 1.02 GBytes 8.78 Gbits/sec
[ 3] 7.0- 8.0 sec 1.02 GBytes 8.74 Gbits/sec
[ 3] 8.0- 9.0 sec 1.01 GBytes 8.69 Gbits/sec
[ 3] 9.0-10.0 sec 1.02 GBytes 8.75 Gbits/sec
[ 3] 10.0-11.0 sec 1.05 GBytes 9.03 Gbits/sec
[ 3] 11.0-12.0 sec 1015 MBytes 8.51 Gbits/sec
[ 3] 12.0-13.0 sec 1.02 GBytes 8.72 Gbits/sec
[ 3] 13.0-14.0 sec 1014 MBytes 8.51 Gbits/sec
[ 3] 14.0-15.0 sec 974 MBytes 8.17 Gbits/sec
[ 3] 0.0-15.0 sec 15.6 GBytes 8.92 Gbits/sec

Searching around the internet, I found the official performance tuning guide from AMD and the host tuning pages at fasterdata.es.net.

They suggest making certain system settings, such as changing the CPU governor and the TCP buffer sizes. I made the changes accordingly and only got about a 1 Gbit/s improvement.
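
For reference, the kind of changes those guides suggest looks roughly like the following. The buffer values here are illustrative examples in the spirit of the fasterdata.es.net recommendations, not values verified on this hardware:

```shell
# Raise the maximum socket buffer sizes so a single TCP stream can
# open a window large enough for 40 GbE (example values; tune for
# your bandwidth-delay product)
sysctl -w net.core.rmem_max=536870912
sysctl -w net.core.wmem_max=536870912
sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456"
sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456"

# Pin all cores to the "performance" cpufreq governor
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
    echo performance > "$g"
done
```

These sysctl settings are lost on reboot; to make them permanent they would go into /etc/sysctl.conf or a file under /etc/sysctl.d/.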

When I checked the CPU clock speed, the CPU always clocked down to around 400 MHz while iperf was running.
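
A quick way to observe this is to watch the core clocks from a second terminal while iperf runs (a diagnostic sketch; cpupower comes from the linux-tools package):

```shell
# Show the highest current core clocks once a second; under load they
# should sit near the rated boost clock, not near 400 MHz
watch -n 1 'grep "^cpu MHz" /proc/cpuinfo | sort -rn -k4 | head'

# With linux-tools installed: report the active governor and limits,
# and the time spent in each C-state
cpupower frequency-info
cpupower monitor
```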

Any suggestions as to why iperf sends the CPU to sleep, or how I could improve single-thread TCP transmission speed? Running multiple TCP streams utilizes the bandwidth better, but that is not our use case.
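
For completeness, this is what the multi-stream test looks like, along with one hedged idea for the single-stream case (the core number passed to taskset is an arbitrary example; in practice you would pick a core on the NUMA node local to the NIC):

```shell
# Four parallel TCP streams; each stream can land on a different
# NIC queue/core, so the aggregate usually gets closer to line rate
host2# iperf -c host1 -P 4 -i 1 -t 120

# Single stream, pinned to one core so it does not migrate between
# cores (and their differing clock/C-state behaviour) mid-test
host2# taskset -c 2 iperf -c host1 -i 1 -t 120
```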

thank you

Flaep
  • You might want to ask this question in the Level1Techs forum. Wendell frequently works with these crazy-high-throughput configurations, and I remember other YouTubers mentioning him when trying to diagnose similar issues. If you find an answer there, come back and post it here. – djsumdog Sep 17 '21 at 18:17
  • thank you for your reply. – Flaep Sep 20 '21 at 06:56
  • I opened a thread here: [link](https://forum.level1techs.com/t/amd-epyc-7413-slow-down-to-arround-400mhz-when-running-iperf/17626) – Flaep Sep 20 '21 at 07:09
  • You are using only one TCP stream for the iperf test. A 40 Gbps link is internally 4x 10 Gbps. Use the -P option to test with multiple parallel streams to get more than 10 Gbps. – Alexander Worlitschek Sep 22 '21 at 14:51
  • The [Intel: Linux Performance Tuning Guide](https://www.intel.com/content/www/us/en/products/details/ethernet/700-controllers/xl710-controllers/docs.html?q=tuning) contains a checklist. – anx Sep 28 '21 at 09:10

1 Answer


I changed Global C-State Control in the BIOS from Auto to Disabled and set

tuned-adm profile network-throughput

I am not sure if it is the final solution but it works for now.
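
To verify that the changes took effect, something like the following can be used (a sketch; cpupower again requires the linux-tools package):

```shell
# Confirm the tuned profile is active
tuned-adm active    # expect: Current active profile: network-throughput

# The governor should now read "performance" and the cores should
# no longer be allowed into deep C-states
cpupower frequency-info
cpupower idle-info
```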

Edit:

In the end a BIOS update was necessary as well. tuned-adm does, however, still provide a performance increase.

Flaep
  • try the HWE kernel; quite a few power-saving patches have been upstreamed but not backported to 5.4. – anx Sep 23 '21 at 03:26
  • thank you for your suggestion, but since this is a production system those kernels are a bit too edgy – Flaep Sep 28 '21 at 08:26
  • The HWE kernels are also longterm kernels and should generally be as stable as any other. You get to reboot once every six months or so but that's about it, and you probably are doing that anyway. There is also an HWE edge kernel, which is meant for testing the next HWE kernel. I think that's what you may have heard of? No need to use that though unless you want or _need_ to test something new. As for kernels, in general you may also want to try the lowlatency or lowlatency HWE kernels, for improved networking. – Michael Hampton Sep 28 '21 at 10:31