0

Given that SMART is unreliable for predicting HDD failure, does anyone have a reliable alternative for automating quick identification of hard drives that are beginning to fail?

This is for Windows Server 2008 R2. I do not have the luxury of using ReFS.

My primary concern is backing up corrupted data for an extended period of time without knowing it is corrupted.

gravidThoughts

2 Answers

3

Use a filesystem which is capable of detecting and repairing corruption, such as ZFS or btrfs, or Windows ReFS.

Michael Hampton
2

Hard drives can die randomly and suddenly. SMART helps identify the ones that die slowly, but not the ones that die quickly.

When 'beginning to fail' and 'completely dead' are seconds apart, there is no warning.

If you are concerned about corrupt data going unnoticed, use a hardware RAID solution with a media patrol type feature, which regularly scans all the drives for corruption. For RAID 5/6, it recalculates the parity for every stripe and checks that it matches what is stored on disk.
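To illustrate what such a scan does on RAID 5, here is a minimal sketch, assuming single XOR parity over equally sized blocks. The function names `xor_parity` and `scrub_stripe` are made up for this example; a real controller does this in firmware across physical drives, and RAID 6 adds a second, different parity calculation that is not shown here.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub_stripe(data_blocks, stored_parity):
    """Return True if the parity recomputed from the data matches the stored parity."""
    return xor_parity(data_blocks) == stored_parity


# Hypothetical stripe: three data blocks plus the parity written alongside them.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(stripe)

# A single flipped bit in one block makes the scrub fail for that stripe.
damaged = [stripe[0], b"\x10\x20\x31\x40", stripe[2]]
print(scrub_stripe(stripe, parity), scrub_stripe(damaged, parity))
```

The same XOR also rebuilds a missing block from the surviving blocks and the parity, which is why the array can repair what the scan finds.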

You should also test your backups regularly to ensure they work properly, and keep not just the latest backup but a few older ones for when a corrupt or deleted file isn't discovered for months.

Grant
  • The media patrol functionality you describe is what I am looking for. I have some non-RAID drives I would like to monitor using a service or scheduled task: a lightweight version of chkdsk that only scans. – gravidThoughts Mar 29 '14 at 18:38
  • Sudden failures will not have produced corrupt backups. I am primarily concerned with longer term degradation that is not being reported correctly by SMART, and resulting in unreadable sectors in my data. – gravidThoughts Mar 29 '14 at 18:41
  • @gravidthoughts chkdsk has a check-only switch. For media patrol you need something to compare the data to, hence the RAID requirement. The computer can't tell if the data is corrupt if it doesn't know what it is supposed to look like. – Grant Mar 29 '14 at 18:44
  • Is the check-only switch light on resources? – gravidThoughts Mar 29 '14 at 18:49
  • @gravidthoughts it will use a fair bit of disk I/O. Whether or not it's acceptable performance-wise will have to be tested on your specific hardware. Even so, chkdsk won't catch all possible corruption. – Grant Mar 29 '14 at 18:53
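
For non-RAID drives, one way to give the computer "something to compare it to" is a checksum manifest: record a hash of every file once, then re-read and re-hash on a schedule. The following is a minimal sketch, not a tested monitoring tool. The function names and the idea of running it from a scheduled task are assumptions for illustration. It only suits data that is not supposed to change, since a legitimately edited file also shows up as a mismatch.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    manifest = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            manifest[os.path.relpath(path, root)] = sha256_of(path)
    return manifest


def verify(root, manifest):
    """Re-read every file in the manifest and list the ones that fail."""
    problems = []
    for rel_path, expected in sorted(manifest.items()):
        try:
            actual = sha256_of(os.path.join(root, rel_path))
        except OSError as error:
            # Missing files and unreadable sectors both surface here as read errors.
            problems.append((rel_path, "unreadable: %s" % error))
            continue
        if actual != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

Store the manifest somewhere other than the drive being checked (for example as JSON on another volume), and run `verify` before each backup so a damaged file is flagged before it replaces a good copy. Like chkdsk, this reads the whole data set, so expect a fair amount of disk I/O.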