
We have some Windows Server 2008 R2 machines that are experiencing a strange network issue. The best way I can describe it is that we see low throughput (despite plenty of available bandwidth), but only when operating over internet-scale latency (20 ms or more).

Symptoms:

1.) The servers in question are located in two tier 1 datacenters with great, wide internet pipes.

2.) If we RDP onto servers in Data Center #1, we can run a SpeedTest.net test and get hundreds of megabits down/up.

3.) If we download a CentOS ISO image (basically a big file) on a server, we get between 6-20 megabytes/sec of throughput.

4.) If we connect from another datacenter to a web server we run on a FreeBSD 9.0 box (so, FreeBSD in DC #1, Windows server in DC #2) and download a big file (hundreds of MB), we also get great performance (again, many megabytes/sec).

5.) If we connect from a server in DC #1 to another Windows server in DC #1, we get awesome performance (nearly 1 Gbit/sec).

6.) If we connect from a Windows server box in DC #1 to a Windows server box in DC #2, we get ~450 kilobytes/sec (really poor throughput). But we can open multiple browsers/windows and download as many times as we want (simultaneously) and still get exactly the same 450 KB/sec per stream.

7.) We ran NTTTCP and a Windows port of netperf between Windows servers in DC #1 and DC #2 and, in every case we can find, we get the same throughput (~450 KB/sec) on each download.

8.) We ran a whole bunch of other tests that allow us to conclusively eliminate firewalls etc. as the source of our issue (also, see test 4: the FreeBSD server is communicating through the same [hardware] stack as the Windows servers...).

We tried disabling TCP window scaling on a pair of Windows servers, to no effect.
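A back-of-envelope bandwidth-delay-product check (using the observed ~450 KB/sec and 20 ms figures from above; the exact values are illustrative) shows why a fixed effective window caps throughput on a high-latency path, and why the same cap never appears on the LAN:

```shell
# Sketch: throughput ceiling of a single TCP stream ~= effective_window / RTT.
RTT_MS=20                       # observed inter-DC ping time
THROUGHPUT_BPS=$((450 * 1024))  # observed ~450 KB/sec ceiling, in bytes/sec

# Effective window implied by the observed ceiling: window = throughput * RTT
WINDOW=$((THROUGHPUT_BPS * RTT_MS / 1000))
echo "implied effective window: ${WINDOW} bytes"

# Window needed to fill a 1 Gbit/sec pipe at 20 ms RTT:
NEEDED=$((1000000000 / 8 * RTT_MS / 1000))
echo "window needed for 1 Gbit/sec at 20 ms: ${NEEDED} bytes"
```

This works out to an implied window of only ~9 KB, far below even the classic 64 KB default, which is why graphing the advertised window in Wireshark (as suggested in the comments) is a reasonable next step.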

So, ServerFault community: it seems like it must be a Windows TCP setting of some sort. What causes this kind of issue?

EDIT

1.) We installed VMware/CentOS on one of the Windows Server 2008 R2 boxes having the issue, with bridged networking, and, guess what: great throughput! So, a VMware guest on a Windows host gets great performance. Go figure!

SvrGuy
  • I banged my head against a similar issue (http://serverfault.com/questions/278343/slow-transfers-over-distance) for a while until one day it just started going faster. You might try graphing TCP window sizes with Wireshark to see if you are hitting the bandwidth-delay product, as that is the common culprit. In the end, though, in my case, I think it was probably some packet loss or bufferbloat that the Windows TCP stack doesn't handle as well as Linux. – Kyle Brandt Apr 24 '12 at 02:04
  • I think our case is worse than yours, though -- we are talking about 400 KB/s between two datacenters in the same city with ping times between them of 20 ms... The difference between that and Linux/FreeBSD is 10-40x. – SvrGuy Apr 24 '12 at 02:44

1 Answer


Try disabling TCP Chimney Offload and receive window auto-tuning:

netsh int tcp set global chimney=disabled
netsh int tcp set global autotuninglevel=disabled
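
After making the changes, the current global TCP settings can be confirmed (the output lists the chimney and auto-tuning state, among other parameters):

```shell
netsh int tcp show global
```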
Chris McKeown
  • We tried both of these options and they had no effect. We also turned flow control on; again, no effect. We tried all permutations thereof; no effect. – SvrGuy Apr 24 '12 at 13:39