We had a disk in a RAID array go bad, and we suspect another one might have minor errors. To be safe, we are trying to recover both disks using ddrescue. The various help pages recommend a two-to-three pass copy: first a logged no-split pass, then going back over the error sectors, e.g.
ddrescue --no-split --force /dev/sdc /dev/sdb logfile
ddrescue --direct --max-retries=3 /dev/sdc /dev/sdb logfile
then running the second command again with --retrim added if there were still errors.
The problem is that I can see the initial pass occasionally slowing down, so I checked the dmesg log. I can see the same types of I/O errors (Medium Error tag#25 Sense: Unrecovered read error) showing up in the system log, but ddrescue is not registering any errors in its status.
UPDATE
OK, ddrescue is now showing 2 errors, but I'm seeing more than 2 in the system log, and ddrescue showed none when the first few errors appeared in the system log.
What I need to know is whether the second command above only checks the sectors that ddrescue logged as bad, and whether I should re-run the first command with other flags, such as --direct, on that pass as well. (I'm wondering if something in the drive firmware may be preventing ddrescue from seeing all errors.)
SW
Addendum
After starting a retrim, I have been monitoring it: the error count in the re-read passes dropped to 285, but it is now reading 291. I thought the idea of the later passes was specifically to recover error sectors, and I did not expect the number to do anything but go down. What am I missing here?
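
For anyone wanting to compare figures, here is a rough sketch of how the logfile could be summarised between passes. It assumes the logfile follows the layout described in the GNU ddrescue manual ('#' comment lines, one status line, then "pos size status" data lines with hex values, where '-' marks a bad-sector area). The helper name summarize_map is made up for illustration and is not part of ddrescue.

```shell
# Hypothetical helper: count bad areas and bad bytes in a ddrescue logfile.
# Assumption: data lines look like "0x00001000  0x00000200  -".
summarize_map() {
    bad_areas=0
    bad_bytes=0
    while read -r pos size status _; do
        # Skip comment lines and blank lines.
        case "$pos" in '#'*|'') continue ;; esac
        # Skip the status line, whose second field is not a hex size.
        case "$size" in 0x*) ;; *) continue ;; esac
        if [ "$status" = "-" ]; then
            bad_areas=$((bad_areas + 1))
            # Shell arithmetic accepts the 0x-prefixed hex size directly.
            bad_bytes=$((bad_bytes + $size))
        fi
    done < "$1"
    echo "bad_areas=$bad_areas bad_bytes=$bad_bytes"
}
```

Running summarize_map logfile before and after a pass prints two separate figures: the number of bad areas and the total bytes they cover. Tracking both may help show whether the rising number reflects more unreadable data or just a different count of areas.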