The setup is:

  • FUJITSU PRIMERGY TX300 S7
  • RAID Ctrl SAS 6G 5/6 512MB (D2616)
  • Windows Server 2012 64-bit
  • Volume C: RAID 1 (2 HDDs)
  • Volume D: RAID 5 (8 HDDs)

The problem is:
When we do something with a large number of files on volume D:, everything is OK at first, but after several minutes the speed drops drastically (when deleting, it goes from 100 files/sec to 1 file/sec; when copying, from 100 MB/sec to 15 MB/sec).
Sometimes volume D: becomes inaccessible (it is still visible in Explorer, but the used space bar disappears).
And sometimes the system freezes so hard that it even stops replying to pings.

We thought that it might have something to do with caching, but we can't disable it: we get the "Windows could not change the write-caching setting for the device" error.

How do we diagnose/fix the problem? Please help

alexander.polomodov
real_sm
  • Define D: RAID 5 with WHAT HDDs? Do you have a BBU on the RAID card? Without that and with slow disks... what do you expect? My Fiat Panda is not as fast as a Ferrari. And your RAID card does no caching at all? – TomTom Jul 12 '18 at 15:51
  • "Define D: RAID 5 with WHAT HDDs?" - what do you mean? – real_sm Jul 12 '18 at 16:21
  • I mean what I ask. What HDDs does your server have? Unless they are high-end, super fast SAS-style disks, you likely got something you did not even know you paid for: slow disks, which do not magically get faster in a slow RAID 5 on a RAID controller without write cache. – TomTom Jul 12 '18 at 16:22
  • "Do you have a BBU on the RAID card? Without that and with slow disks... what do you expect? My Fiat Panda is not as fast as a Ferrari. And your RAID card does no caching at all?" - we don't have a BBU. We don't expect to have a Ferrari, we want at least to move. Why does the process start OK? Why does caching have an influence on the deleting process? Why does the used space bar disappear? Why does the system freeze? – real_sm Jul 12 '18 at 16:25
  • SATA2 7200 RPM HDDs. – real_sm Jul 12 '18 at 16:27
  • So, slow disks. OK, what do you expect? Take a RAID calculator and be shocked at how slow they are in a RAID 5. And you still did not mention the model (cache size makes a difference) and size. How about you ask on superuser.com? Here people expect you to actually provide relevant information. – TomTom Jul 12 '18 at 16:51
  • Questions seeking installation, configuration or diagnostic help must include the desired end state, the specific problem or error, sufficient information about the configuration and environment to reproduce it, and attempted solutions. Questions without a clear problem statement are not useful to other readers and are unlikely to get good answers. – TomTom Jul 12 '18 at 16:52
  • @TomTom sorry, there are not 8 HDDs in the RAID 5, but 6 HDDs: 5 Seagate ST2000NM0011, 1 WD WD2004FBYZ. – real_sm Jul 12 '18 at 17:12
  • The desired end state is: system doesn't freeze while deleting/copying files, volume D: is always accessible, the used space bar doesn't disappear from Explorer. – real_sm Jul 12 '18 at 17:14
  • You run a RAID 5 with 2 TB disks? Did you ever do the math - that has a high chance of failing totally when one disk dies. RAID 6 minimum for that. And yes, those disks are slow. VERY slow. Live with it. Get a RAID controller with some caching capabilities and add some TB of SSD in front, or move to a RAID-less setup that allows you to use some M.2 SSD as cache. Your setup is made to be slow and that is what you now got. – TomTom Jul 12 '18 at 17:19
  • @TomTom OK, so you say it's slow; calculators and books say it's a 20-30% speed drop. What we have is system freezes and an inaccessible volume. How do we fix that? – real_sm Jul 12 '18 at 17:37
  • It is a lot more in speed drop. Do you have a BBU? Do you have caching enabled on the RAID controller (use the RAID controller tools, not Windows)? – TomTom Jul 12 '18 at 17:56
  • As I said earlier, we don't have a BBU. That's why caching is disabled in the RAID software. – real_sm Jul 12 '18 at 18:56
  • First thing to change. Makes a HUGE difference. – TomTom Jul 12 '18 at 19:18
  • A BBU costs 170 EUR. What if we buy it and the freezes continue to happen? – real_sm Jul 12 '18 at 20:37
  • @TomTom isn't caching about something that is used more than once? We are trying to delete files that haven't been accessed for several months, so how can caching possibly help to fix the freezes in this case? – real_sm Jul 17 '18 at 07:12
  • So, you think the metadata of the disk... see? There is a lot more than the files and some of the structures you hit. Also, a BBU write cache means that the computer gets an "OK, change is committed" while the disk has not written it out yet. Significant speed boost. – TomTom Jul 17 '18 at 07:21
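
The "do the math" remark about a RAID 5 rebuild failing can be sketched roughly as follows. This is a simplified model, assuming every bit is read independently with a fixed unrecoverable read error (URE) rate. The two rates used below are common datasheet-style figures taken as assumptions, not values confirmed for the drives in this thread, and the function name `rebuild_ure_probability` is hypothetical.

```python
import math


def rebuild_ure_probability(num_disks, disk_size_bytes, ure_rate_per_bit):
    """Chance of hitting at least one URE while rebuilding a RAID 5 array.

    Simplified model (an assumption): after one disk fails, every bit on the
    remaining disks must be read once, and each bit fails independently.
    """
    surviving_disks = num_disks - 1
    bits_to_read = surviving_disks * disk_size_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_to_read * math.log1p(-ure_rate_per_bit))


# Six 2 TB disks, as described in the comments; URE rates are assumptions.
for rate in (1e-14, 1e-15):
    p = rebuild_ure_probability(6, 2e12, rate)
    print(f"URE rate {rate:g} per bit: rebuild hits an error with p = {p:.2f}")
```

Real drives do not fail this uniformly, so the result is only an order-of-magnitude indication of why RAID 6 is often suggested for large disks.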

1 Answer


I followed TomTom's advice in the comments and plugged your disk information from the comments into the Free RAID Calculator:

Capacity: 10000 GB, Speed gain: 5x read speed, no write speed gain, Fault tolerance: 1-drive failure

So your write speed is roughly that of a single 7200 RPM disk, which is slow but probably doesn't account for what you're seeing unless you're really pounding the disks.
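
For reference, the calculator's numbers follow from the usual RAID 5 rules of thumb. Here is a minimal Python sketch, assuming six 2000 GB disks as mentioned in the comments. The function name `raid5_estimate` and the field names are made up for illustration, and the small-write penalty of 4 I/Os is the textbook figure, not something measured on this controller.

```python
def raid5_estimate(num_disks, disk_size_gb):
    """Rule-of-thumb RAID 5 figures, similar to what online calculators show."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    data_disks = num_disks - 1  # one disk's worth of space holds parity
    return {
        "capacity_gb": data_disks * disk_size_gb,
        "read_speed_gain": data_disks,   # reads are striped across data disks
        "write_speed_gain": 1,           # no gain, per the calculator output
        "fault_tolerance_drives": 1,
        # Textbook small-write cost: read data, read parity,
        # write data, write parity.
        "small_write_penalty_ios": 4,
    }


print(raid5_estimate(6, 2000))
```

The small-write penalty is one reason a controller without write-back cache can fall well below the "no write speed gain" line on workloads with many small files.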

I can see a couple of possibilities:

  1. Your RAID is in a degraded state (has a dead drive). Copying files off of a RAID 5 array in a degraded state can cause the symptoms you describe. Installing the latest MegaRAID Storage Manager, which you can get here under "Management Software and Tools", might help you diagnose the state of your array.
  2. You're running into something that I used to run into a lot in 2008 and 2008 R2: The Windows Dynamic Cache service has less-than-optimal settings for your environment. Basically, to speed up file writes it loads the file into memory and consumes all the memory on the box, causing it to become unresponsive and display the symptoms you describe. Those settings could be changed with a Microsoft download. The download is only for 2003 and 2008, though.
  3. You're doing something cruel and unusual, disk-IO-wise, that your slow disks just can't handle (imaging a bazillion PCs without multicast? writing a bazillion video streams simultaneously?).

At any rate, it's hard to tell based on the information you've provided but hopefully this will get you pointed in the right direction.
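
One way to gather more information is to record write throughput over time and see exactly when it collapses. Below is a minimal Python sketch using only the standard library. The function name `log_write_throughput` and its parameters are hypothetical, and the demo writes to a temporary directory; to test the array you would point the path at a scratch file on volume D: instead, with a much larger `total_mb`.

```python
import os
import tempfile
import time


def log_write_throughput(path, total_mb=64, chunk_mb=4, sync=True):
    """Write total_mb of data in chunks and return MB/s for each chunk."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    samples = []
    written = 0
    with open(path, "wb") as f:
        while written < total_mb:
            start = time.perf_counter()
            f.write(chunk)
            if sync:
                # Push the data to the device so the OS file cache
                # does not hide a slowdown on the disks themselves.
                f.flush()
                os.fsync(f.fileno())
            elapsed = time.perf_counter() - start
            samples.append(chunk_mb / elapsed if elapsed > 0 else float("inf"))
            written += chunk_mb
    return samples


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rates = log_write_throughput(os.path.join(tmp, "probe.bin"), total_mb=16)
        for i, rate in enumerate(rates):
            print(f"chunk {i}: {rate:.1f} MB/s")
```

Comparing a run with `sync=True` against one with `sync=False` may hint at whether the slowdown comes from the disks and controller or from how Windows is caching the writes. Note that the test itself adds load to the volume.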

Katherine Villyard
  • Also not sure how his NCQ settings are. Without NCQ enabled... ouch. – TomTom Jul 12 '18 at 19:18
  • 1. Both RAIDs are OK. We have RAID Management software installed. – real_sm Jul 12 '18 at 19:57
  • 2. I will look into that. Though it seems unlikely, because as I see from the Zabbix graphs, we had at least 8 GB of RAM free (out of 24 GB) even at the most heavily loaded times. 3. It's a file server and FTP server with a maximum of 10 clients connected and working with 20 files. Strange things begin to happen if a user (1 user) starts to manage a lot of data (not in number, but in gigabytes). – real_sm Jul 12 '18 at 20:03