We're a student association that has purchased a dedicated server to host game servers and similar services. The server seems to experience occasional disk I/O bottlenecks. This is especially true when starting up the services, but it happens at other times as well. Even tasks such as opening Server Manager or Control Panel can take a few seconds before the window shows.

  • Server Model: FUJITSU PY RX100 S8 Xeon E3-1220v3 4xSFF

  • Hard Drive: WD Red 750GB NAS hard drive.

    • SATA 3 drive rated at 6Gb/s

Due to budget constraints, we could only afford one HDD, so we cannot run our server in RAID. We obviously plan on upgrading, given our heavy use.

Is the disk really the bottleneck? Will getting a faster disk, or more disks, solve our problem?

Additionally: we tried to defragment the disk while the server was essentially idle. Although the disk is less than half full and we left the defragmenter running all night, it made no progress.

Here are some screenshots from Resource Monitor, taken while the server was downloading a Windows Update; I also turned on some game servers. I see it hit about 10 MB/s, which looks like a bottleneck to me, but if it maxes out at 10 MB/s, what causes this? I understand the HDD should be capable of better speed?

Screenshot 1 Screenshot 2

EDIT: Re-factored post for clarity and updated information. See comments for more screenshots.

Blt950
  • I think we would prefer to have images that are clearly readable. I just had to copy/paste them into Paint.NET to be able to zoom in, and I can only decipher the text vaguely. – TomTom May 09 '15 at 18:16
  • Am I right in assuming that your chosen hardware is actually via a specific data-hosting environment, and that your budget can't afford their bloated support for a $90 SSD? Also, what's the total non-static disk space required by all these games? – Otheus May 09 '15 at 19:31
  • I noticed that the screenshot quality was worse than I thought. I reuploaded them, now in the first post, with hopefully better quality. @Otheus The dedicated server is purchased and owned by us, and it's connected to a student datacenter in our city. An SSD was not an option at the time of purchase. Though we're considering upgrading the HDD if this is the main reason for our bottleneck, and we won't exclude the possibility of choosing an SSD this time. We have around 6 game servers which use a maximum of 6 GB each; non static space would be max 3 additional GB. So we have plenty of HDD space left. – Blt950 May 09 '15 at 20:22
  • "non static space would be max 3 additional GB" for each game or total? If these games use the disk for temporary data, prime target for ram-disk. Separate HDDs might be the better answer. You can have up to 3 more. Partition the games to use separate ones. Maybe you can scrounge around for older SATA2 drives which might be cheaper and would perform probably as well as you need to ... until you can afford the nice ones. – Otheus May 09 '15 at 20:29
  • To be on the safe side, I would say an additional 3 GB _each_ server, though I think I'm already exaggerating. As I mentioned in the other comments, this lag also exists when no servers are online. I can get it properly tested and screenshotted a bit later when I can shut them down. – Blt950 May 09 '15 at 20:33
  • As most questions would probably need these results and observations, I reply here. Here is a series of screenshots which are hopefully helpful in determining the source of my issue. [link](http://i58.tinypic.com/5lq539.png) [link](http://i58.tinypic.com/sdlfth.png) [link](http://i57.tinypic.com/2mw5uhk.png) [link](http://i57.tinypic.com/28223qx.png) [link](http://i57.tinypic.com/ja7vj6.png) – Blt950 May 09 '15 at 21:58
  • [link](http://i58.tinypic.com/2ag7yfc.png) Oddly, now everything seems quite stable and not laggy. I'll try to generate some activity on the game servers and see the results. EDIT: Doesn't seem to make any difference. I could hope the newly applied Windows update did something, or maybe it doesn't get slow before it has been online for a few hours. I'll report back tomorrow and see if it got any worse. – Blt950 May 09 '15 at 22:04
  • Please post a picture of the "Disk" tab in Task Manager, and expand the "Storage" group on the bottom. If you have constant Disk Queue Length, this means that your disk is not fast enough. A disk queue length > 1 (for 1 disk!) means it's not fast enough. – MichelZ May 10 '15 at 08:05
  • [link](http://oi59.tinypic.com/koldw.jpg) [link](http://i60.tinypic.com/2la4u9x.jpg) [link](http://i58.tinypic.com/svihav.jpg) – Blt950 May 10 '15 at 09:25
  • ** = I tried to run Windows' disk defrag overnight, and it took really forever; it still does with the one I screenshotted for you here. It's like up to 10 seconds for each file, not sure if it's supposed to take that long. I've also tried to partition my disk, so I can move the servers to another partition, though that didn't work out. The partitioner just froze when I tried to shrink the C: disk, and nothing happened for over 15 minutes, so I aborted it in the end. It seems the worst lag is not present anymore, but there is definitely some kind of bottleneck somewhere causing all this. – Blt950 May 10 '15 at 09:27
  • Clarification: I ran Windows Disk Defrag overnight, and this morning it was stuck at: "Pass 1: 0% consolidated". So I aborted it. The one I use in the screenshots now is very slow as well; it says "Remaining time > 1 day". – Blt950 May 10 '15 at 09:38
  • I'd say "Benchmarks are not too bad but should be much better". For chunks as small as 64kB, it still writes at 60 MB/s and reads at under 80. Unfortunately, it never gets better than that. That's well under what other benchmarkers showed. I'm disturbed by the DeFrag hanging... but not surprised by repartitioning not working (you really have to shut down everything but essential services). I'm editing your original post to clarify the state of matters and draw the attention of Windows experts. – Otheus May 10 '15 at 11:24
  • I agree with you, the benchmarks are not too bad. The defrag and partition issue does disturb me as well; it indicates something is wrong. Feel free to edit my original post, as it's a bit messy now :) Thank you for all the help so far. – Blt950 May 10 '15 at 11:33
  • At least the SMART data looks good. Nothing physically wrong with the disk. The "Benchmark when all serv online" looks great! < 120 KB/s, < 2% activity time... The "All servers starting" look like the original problem -- but you would expect that during startup, and after startup, the system is idle again (right?). If it happens again, contact me via email. See my contact info for hints. – Otheus May 10 '15 at 11:34
  • Indeed, except for the defrag and partition issue, there seems to be less lag and it's better now. I'll keep monitoring it and contact you if the lag issue comes back. Should I maybe post a new question focusing on the defrag/partition issue? – Blt950 May 10 '15 at 11:37
  • Do some searching first. You might find it's not uncommon. – Otheus May 10 '15 at 11:47
  • Will do! Windows 7 Pro Server = Windows Server 2012 r2? Because that's what I'm using. – Blt950 May 10 '15 at 11:50
  • I have no clue why any reasonable person would put this topic on hold. Probably typical kabal stupidity. – Otheus May 11 '15 at 22:44
  • "Questions on Server Fault must be about managing information technology systems in a business environment. Home and end-user computing questions may be asked on Super User". That's the reasoning for putting it on hold, though a student association is neither a business nor a home end user; it's a mix of both. So I don't understand that reasoning. – Blt950 May 12 '15 at 08:05

3 Answers

Given that your disc is overloaded - yes, that is a bottleneck. One disc is notoriously crappy in IO (unless it is an SSD). That said - if you run game servers, do you really care about the starting time (which is rare)? Just "sit it out". Once the programs are started, things should be faster. For FAST IO - you ultimately want (need) an SSD.

TomTom
  • I disagree completely with your response. One disk versus two disks won't make a difference; two disks might even be a little slower. Three+ disks, if used in software RAID, might also actually be a tad slower than one disk (but probably faster with a controller-based RAID - it just depends). For their particular server, controller-based RAID is an option. – Otheus May 09 '15 at 17:45
  • And w.r.t. SSDs, this is exactly the kind of server I would not use an SSD with: heavily write-based. SSD write speeds are only marginally faster than those WD server drives, which contain lots of cache (16 MB for example). One benchmark shows this model drive (WD Red) gets up to 125 MB/s on read-write mixes. I've seen cheaper SSDs get less than that. – Otheus May 09 '15 at 17:45
  • You have zero clue what you talk of. SSD have 400-500mb/s random access write speeds. Show me one HD doing that with random (server load) IO. – TomTom May 09 '15 at 17:56
  • TomTom, believe me, I have not zero clue. While some SSDs have 300 MB/s (not mb/s -- I hope not!) speeds, some have much lower than that. – Otheus May 09 '15 at 18:00
  • For instance, one of the Crucial 256 GB models is reported to have 4k-random write speeds of 77 MB/s and "mixed" speeds of 35 MB/s. http://ssd.userbenchmark.com/Crucial-MX100-256GB/Rating/2317 The OP's WD Red drive might actually out-perform that one. Compare with http://www.storagereview.com/wd_red_25_1tb_hdd_review_wd10jfcx Yes I know, the comparison is not apples-to-apples. – Otheus May 09 '15 at 18:11
  • The starting time of the servers isn't really the bothersome part, even though it takes a while if I'm trying to start all at once - obviously. The lag occurs even if I'm not having any servers online, as pointed out in answer below. It's weird that one HDD without any activity other than Windows update and nothing else in background should make this kind of lag. Not to mention it takes sometimes ages just to install the updates. – Blt950 May 09 '15 at 20:17

EDIT: TomTom pointed out to me that each process is waiting over 2 seconds (!) for its I/O operations to complete.

You do have an I/O problem. It may or may not be an issue with the hard drive, which, by all measures, is a perfectly good high-performance drive. But the problem might just be that you have all these processes contending for the same resource. If Process A needs to read 30 bytes, then write 50, then wait for a second, then repeat, and Processes B, C, D, E, and F are all doing the same thing, and they all have their data scattered over the disk, your hard drive will literally be spinning its wheels and getting very little work done.
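To put rough numbers on that, here is a minimal back-of-the-envelope sketch in Python. It assumes a simple service-time model (average seek, plus half a rotation, plus transfer time). The 5400 rpm, 12 ms seek, and 100 MB/s sequential figures are illustrative assumptions, not measured values for this particular WD Red, and the function name is made up for the example.

```python
def random_io_throughput_mb_s(request_kb, rpm=5400, avg_seek_ms=12.0,
                              sequential_mb_s=100.0):
    """Estimate throughput when every request needs its own seek.

    All default values are assumed, illustrative figures for a
    spinning disk; they are not taken from any datasheet.
    """
    # On average the platter must turn half a revolution to reach the data.
    rotational_latency_ms = 0.5 * 60_000.0 / rpm
    # Time to actually move the bytes once the head is in position.
    transfer_ms = (request_kb / 1024.0) / sequential_mb_s * 1000.0
    service_ms = avg_seek_ms + rotational_latency_ms + transfer_ms
    requests_per_second = 1000.0 / service_ms
    return requests_per_second * request_kb / 1024.0


if __name__ == "__main__":
    for size_kb in (4, 64, 1024):
        rate = random_io_throughput_mb_s(size_kb)
        print(f"{size_kb:>5} KB random requests: {rate:.2f} MB/s")
```

Under these assumptions, small scattered requests land far below the sequential rating because nearly all of the time goes to seeking rather than transferring, which would be consistent with a busy disk that only moves a few MB/s.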

Here's what you might be able to do: You're using less than 40% physical memory. Create a ram disk, and run the games from that. (Here's a howto guide: http://www.tekrevue.com/tip/create-10-gbs-ram-disk-windows/). IF the problem goes away, you know what you must do: Buy an SSD. Or, keep it running in RAM and use some kind of synchronization tool to constantly write backups to disk. The HDDs will handle THAT kind of load.

A quick Google search turned up a product that you might find useful. For $49 you can get a full license for a disk-image-backed RAM drive from https://www.softperfect.com/products/ramdisk/. If that doesn't work, keep googling.
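If you go the "keep it in RAM and write backups to disk" route without a dedicated product, the synchronization step could look roughly like this minimal Python sketch. It is only an illustration: the function names, the drive letters in the usage note, and the 5-minute interval are all assumptions, and it relies on `shutil.copytree(..., dirs_exist_ok=True)`, which needs Python 3.8 or later.

```python
import shutil
import time


def sync_once(ram_dir, backup_dir):
    """Copy everything under ram_dir into backup_dir.

    Existing files in backup_dir are overwritten; files deleted from
    ram_dir are NOT removed from the backup (this is a plain copy,
    not a mirror).
    """
    shutil.copytree(ram_dir, backup_dir, dirs_exist_ok=True)


def sync_forever(ram_dir, backup_dir, interval_s=300):
    """Repeat the copy every interval_s seconds until interrupted."""
    while True:
        sync_once(ram_dir, backup_dir)
        time.sleep(interval_s)
```

For example, `sync_forever(r"R:\gameservers", r"D:\gameservers-backup")` would copy a hypothetical RAM disk `R:` to the HDD every five minutes. Files that a game server is writing at the moment of the copy may be captured in an inconsistent state, so a real setup would want the game's own save or snapshot mechanism if it has one.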

EDIT 2: Disable the game services one at a time until the response-time catches up and the system is no longer lagging. If the number of game-services you disabled is less than half the total, you might be able to get away with adding a 2nd HDD and running half the games on the 2nd one.

Otheus
  • The 100MB/S that the SATA 3 drives can sustain is single stream. Multiple servers may be random IO for those drives, and then you see one thing: the throughput goes BRUTALLY down. You can't assume a MB/S number on a non SSD means anything for non-serial workloads. The disc is CLEARLY at the limit given the long term 100% utilization - it handles IO as fast as it can. – TomTom May 09 '15 at 18:11
  • Did you look at the (admittedly) not exactly high resolution images? The disc has a response time of more than 2 seconds on all shown IO. Not milliseconds. Keep saying that is not overloaded. – TomTom May 09 '15 at 18:15
  • LOL TomTom. Get angry much? The Windows "100% utilization" does NOT mean the disk is actually reading and writing constantly but that it constantly has requests. What we cannot tell from this chart is the latency each process has due to disk I/O. *That's* what's really important. Those drives the OP mentioned have 16 MB or more of cache, and for all we know, they are serving all the data and the disk's actuator is not moving at all. Unlikely of course. SSDs might actually not do much better in another scenario due to the wear-levelling and lots of small writes. – Otheus May 09 '15 at 18:21
  • Thanks @TomTom. I did not know that "Response Time" was available in Windows resource IO monitoring. Very nice addition they have there. – Otheus May 09 '15 at 18:35
  • Maybe I forgot to point out that the lag/slowness also occurs when the servers are shut off. Obviously, when I'm in the process of starting all servers again after a computer restart, it takes a while, though usually it works fine to just sit it out and wait. This is a screenshot with only Windows update downloading in the background. I'll also look into the RAM partitions soon, though I don't think it's the source of the problem, as it lags from the moment it starts up. http://i59.tinypic.com/35d9saf.jpg – Blt950 May 09 '15 at 20:14
  • Automatic updates on a server environment = VERY BAD IDEA. Shut down the game servers, wait for Windows update to complete. Check for more updates. Disable automatic updates. Start game servers back up. Get back to us :) – Otheus May 09 '15 at 20:21
  • @Otheus that's what we are currently doing. We only manually update it once we have some maintenance on the server. Maybe around 4-8 times yearly, I would say. – Blt950 May 09 '15 at 20:24
  • OK, I thought you were suggesting Windows update was going on "in the background" while you have the game servers running. I misunderstood. Still, windows update is very disk-intensive. Defragment after it finishes. What is it like when the system is really "idle"? – Otheus May 09 '15 at 20:32
  • Usually never, I did it this time just to present how it's working. I'll get the server to a more idle state later tonight when I can shut off the servers, then report back :) – Blt950 May 09 '15 at 20:34
  • Alright, the server has been running overnight and nothing has changed really. The lag I first encountered is gone, though that's not the end of the issues; check the comments in the first post. – Blt950 May 10 '15 at 09:29
  • Good news indeed. You can take the money you saved on the SSD and go out for a steak dinner. – dimitri.p May 11 '15 at 22:17

SSDs are not disk I/O "aspirins". And NAS drives are not exactly speed demons either. If you are going to have a single drive and there is an expectation of speed, you start with a WD Black or Velociraptor.

As for the Control Panel taking a second or two to open, if it happens even without all the gaming servers started, there is a fundamental hardware issue to be resolved. (ie slow drive - Get rid of (replace/substitute/trade) the NAS drive).

It also helps to check in the windows Event viewer for any programs/services that keep throwing messages every second or so and either correct their issue or disable them, since constant error reporting doesn't make the machine run any faster.

Run a disk benchmark program like ATTO and compare numbers with and without all the gaming servers running and with other machines you have available (ie yours) to get an idea on how the server's drive is performing.
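If no benchmark tool is at hand, the same with/without comparison can be roughed out in a few lines of Python. This is only a sketch and not a substitute for ATTO: the function names, the 8 MiB file size, and the 4 KiB block size are assumptions, and because the reads go through the operating system's file cache, the numbers mostly reflect cached access unless the test file is much larger than RAM.

```python
import os
import random
import tempfile
import time


def make_test_file(size_bytes):
    """Create a temporary file filled with random bytes; return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def read_throughput_mb_s(path, block_size, offsets):
    """Read one block at each offset, in order, and return MB/s achieved."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=0) as f:
        for offset in offsets:
            f.seek(offset)
            total += len(f.read(block_size))
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


def compare(size_bytes=8 * 1024 * 1024, block_size=4096, seed=0):
    """Time the same blocks read in sequential order and in shuffled order."""
    path = make_test_file(size_bytes)
    try:
        sequential = list(range(0, size_bytes, block_size))
        shuffled = sequential[:]
        random.Random(seed).shuffle(shuffled)
        return {
            "sequential": read_throughput_mb_s(path, block_size, sequential),
            "random": read_throughput_mb_s(path, block_size, shuffled),
        }
    finally:
        os.remove(path)


if __name__ == "__main__":
    for pattern, rate in compare().items():
        print(f"{pattern:>10}: {rate:.1f} MB/s")
```

Running it once with the game servers stopped and once with them running, on the same machine, gives a crude relative comparison; the absolute figures should not be compared against a drive's rated speed.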

If the server came with an OEM version of Windows server you should also be able to get some sort of assistance from Fujitsu.

dimitri.p
  • Control Panel is usually alright. But for example, now I took the time (once all servers are on) to open Dashboard, and this took 51 seconds. Server Manager took around 11-14 seconds, and with loading of all groups and such, around 25 seconds. I'll try to run an ATTO benchmark and come back with the results :) The server came without an OS, so we have our own copy of Windows Server, not OEM. I took a look in the Event Viewer but didn't find anything that helped me, though if you have more specific things and places I should look, I can do that. – Blt950 May 09 '15 at 20:31
  • SSDs are exactly aspirins for IO. Not the super cheap ones unsuitable for the job, but basically, having around 100x or more IO budget than a disc - yes, they help. I replaced an 8 disc Raid 10 Velociraptor array with a 6 disc raid 50 SSD and we went from (fully random) partially down to 30mb/second to a nice 1 gigabyte per second. – TomTom May 10 '15 at 05:20
  • Benchmark results are posted as a comment on the question post. An SSD would definitely be a good investment in the end, though on our budget it might be a bit hard. Though if we come to the conclusion that the HDD is the issue, we'll definitely try to get an SSD. – Blt950 May 10 '15 at 09:31