4

Is there a way to periodically verify that a Linux software RAID array is valid and has no errors? Something like a daemon that would scan all blocks and verify them.

Zoredache
Jackalheart

2 Answers

13

On Debian (and therefore Ubuntu) machines, cron runs:

/usr/share/mdadm/checkarray --cron --all --quiet

the first Sunday of the month. This does exactly what you want.

It basically boils down to:

# echo check > /sys/block/$array/md/sync_action

but with a lot of sanity checking around it. Copy it from your nearest Debian install, or from the mdadm source package.
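As a rough illustration of that sysfs interface, here is a minimal sketch of the core step. It is not the checkarray script itself: the function names and the SYSFS_ROOT variable are hypothetical, introduced only so the logic can be exercised against a scratch directory. It assumes the standard md sysfs files sync_action and mismatch_cnt.

```shell
#!/bin/sh
# Minimal sketch, NOT Debian's checkarray: helper names and SYSFS_ROOT
# are assumptions made for illustration (SYSFS_ROOT eases testing).
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# start_check NAME: request a read-only consistency check on array NAME
# (e.g. md0), but only if the array is currently idle.
# Returns 1 without touching anything when a resync/check is in progress.
start_check() {
    md_dir="$SYSFS_ROOT/block/$1/md"
    if [ "$(cat "$md_dir/sync_action")" = "idle" ]; then
        echo check > "$md_dir/sync_action"
    else
        return 1
    fi
}

# mismatch_count NAME: print the mismatch counter the kernel reports
# for array NAME; read it once the check has finished.
mismatch_count() {
    cat "$SYSFS_ROOT/block/$1/md/mismatch_cnt"
}
```

Run as root, something like `for d in /sys/block/md*; do start_check "${d##*/}"; done` would start a check on each array; once sync_action returns to idle, a non-zero mismatch_count is worth investigating.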

David Pashley
  • On RHEL and CentOS, there is /etc/sysconfig/raid-check, which is used to check or fix the arrays weekly. Found via http://pbraun.nethence.com/doc/sysutils_linux/mdadm.html – becomingwisest Jan 18 '12 at 01:47
3

From the Linux Software RAID HOWTO:

...basic example. Running:

mdadm --monitor --mail=root@localhost --delay=1800 /dev/md2

should release a mdadm daemon to monitor /dev/md2. The delay parameter means that polling will be done in intervals of 1800 seconds. Finally, critical events and fatal errors should be e-mailed to the system manager.
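To make concrete what "monitoring array status" means (as opposed to verifying the data), here is a rough sketch that lists arrays with a missing member by reading an mdstat-format file. This is not how mdadm is implemented; the function name degraded_arrays is hypothetical, and it assumes the usual /proc/mdstat layout in which a status field such as [U_] marks a missing device with an underscore.

```shell
#!/bin/sh
# Illustrative sketch only; mdadm --monitor does far more than this.
# degraded_arrays [FILE]: print the name of every array whose status
# field (e.g. [U_]) contains an underscore, meaning a member is missing.
# FILE defaults to /proc/mdstat; the parameter exists to allow testing.
degraded_arrays() {
    awk '
        /^md/ { name = $1 }                    # remember the current array
        /\[[U_]*_[U_]*\]/ { print name }       # status field with a gap
    ' "${1:-/proc/mdstat}"
}
```

An empty result only says that every array has all its members; it says nothing about whether the redundant data on them agrees.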

David
  • +1 for mdadm. I'm using it too ;) – asdmin Aug 05 '09 at 08:28
  • 1
    The word "verify" with RAID arrays typically means "reading all the array components and making sure they can be read and the redundant data agrees". This is done by writing to the /sys/block/md*/md/sync_action file. "mdadm --monitor" only checks the array status; it does not verify the data within the arrays. Of course, the two go hand in hand, because a verify may cause an array to go into degraded mode, which the operator needs to be alerted to, and "mdadm --monitor" will do that. Running a regular verify is critical; otherwise you may find bad sectors during a future array rebuild. – Sean Reifschneider Sep 15 '09 at 15:33