How to check HDD in Linux for bad blocks

Question

For the past 2 days one MySQL server has been unable to start. I get this in the syslog every 5 seconds:

Dec 17 09:24:35 backup kernel: [  681.132013] ata2.00: exception Emask 0x0 SAct 0x50000 SErr 0x0 action 0x0
Dec 17 09:24:35 backup kernel: [  681.132046] ata2.00: irq_stat 0x40000008
Dec 17 09:24:35 backup kernel: [  681.132071] ata2.00: failed command: READ FPDMA QUEUED
Dec 17 09:24:35 backup kernel: [  681.132105] ata2.00: cmd 60/20:80:00:e6:4d/00:00:78:00:00/40 tag 16 ncq 16384 in
Dec 17 09:24:35 backup kernel: [  681.132105]          res 41/40:20:00:e6:4d/00:00:78:00:00/00 Emask 0x409 (media error) <F>
Dec 17 09:24:35 backup kernel: [  681.132167] ata2.00: status: { DRDY ERR }
Dec 17 09:24:35 backup kernel: [  681.132183] ata2.00: error: { UNC }
Dec 17 09:24:35 backup kernel: [  681.165698] ata2.00: configured for UDMA/133
Dec 17 09:24:35 backup kernel: [  681.165714] sd 1:0:0:0: [sdb] Unhandled sense code
Dec 17 09:24:35 backup kernel: [  681.165717] sd 1:0:0:0: [sdb]
Dec 17 09:24:35 backup kernel: [  681.165719] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Dec 17 09:24:35 backup kernel: [  681.165722] sd 1:0:0:0: [sdb]
Dec 17 09:24:35 backup kernel: [  681.165723] Sense Key : Medium Error [current] [descriptor]
Dec 17 09:24:35 backup kernel: [  681.165727] Descriptor sense data with sense descriptors (in hex):
Dec 17 09:24:35 backup kernel: [  681.165729]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Dec 17 09:24:35 backup kernel: [  681.165738]         78 4d e6 00
Dec 17 09:24:35 backup kernel: [  681.165742] sd 1:0:0:0: [sdb]
Dec 17 09:24:35 backup kernel: [  681.165744] Add. Sense: Unrecovered read error - auto reallocate failed
Dec 17 09:24:35 backup kernel: [  681.165747] sd 1:0:0:0: [sdb] CDB:
Dec 17 09:24:35 backup kernel: [  681.165748] Read(16): 88 00 00 00 00 00 78 4d e6 00 00 00 00 20 00 00
Dec 17 09:24:35 backup kernel: [  681.165759] end_request: I/O error, dev sdb, sector 2018371072
Dec 17 09:24:35 backup kernel: [  681.165802] ata2: EH complete
Dec 17 09:24:41 backup /etc/mysql/debian-start[9912]: Upgrading MySQL tables if necessary.
Dec 17 09:24:41 backup /etc/mysql/debian-start[9916]: /usr/bin/mysql_upgrade: the '--basedir' option is always ignored
Dec 17 09:24:41 backup /etc/mysql/debian-start[9916]: Looking for 'mysql' as: /usr/bin/mysql
Dec 17 09:24:41 backup /etc/mysql/debian-start[9916]: Looking for 'mysqlcheck' as: /usr/bin/mysqlcheck
Dec 17 09:24:41 backup /etc/mysql/debian-start[9916]: FATAL ERROR: Upgrade failed
Dec 17 09:24:41 backup /etc/mysql/debian-start[9930]: Checking for insecure root accounts.

dmesg:

[  721.604068] ata2.00: exception Emask 0x0 SAct 0x600000 SErr 0x0 action 0x0
[  721.604102] ata2.00: irq_stat 0x40000008
[  721.604127] ata2.00: failed command: READ FPDMA QUEUED
[  721.604161] ata2.00: cmd 60/20:a8:00:e6:4d/00:00:78:00:00/40 tag 21 ncq 16384 in
[  721.604161]          res 41/40:20:00:e6:4d/00:00:78:00:00/00 Emask 0x409 (media error) <F>
[  721.604223] ata2.00: status: { DRDY ERR }
[  721.604239] ata2.00: error: { UNC }
[  721.630858] ata2.00: configured for UDMA/133
[  721.630875] sd 1:0:0:0: [sdb] Unhandled sense code
[  721.630878] sd 1:0:0:0: [sdb]
[  721.630880] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  721.630882] sd 1:0:0:0: [sdb]
[  721.630884] Sense Key : Medium Error [current] [descriptor]
[  721.630887] Descriptor sense data with sense descriptors (in hex):
[  721.630889]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[  721.630898]         78 4d e6 00
[  721.630902] sd 1:0:0:0: [sdb]
[  721.630905] Add. Sense: Unrecovered read error - auto reallocate failed
[  721.630907] sd 1:0:0:0: [sdb] CDB:
[  721.630908] Read(16): 88 00 00 00 00 00 78 4d e6 00 00 00 00 20 00 00
[  721.630919] end_request: I/O error, dev sdb, sector 2018371072
[  721.630962] ata2: EH complete
[  721.673419] init: mysql main process (10229) terminated with status 1
[  721.673442] init: mysql main process ended, respawning

How can I check whether the HDD (part of a software RAID 1) has issues? I've tried this:

# smartctl -H /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-35-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

Looks good to me...

I imagine the downvote was because your actual question has nothing to do with MySQL; the title would be more accurate as something like "How to check HDD in Linux for bad blocks?" You should also specify which filesystem you are using, e.g. ext4, btrfs, zfs or others, and say in the body whether you are using RAID and, if so, what type. – BeowulfNode42, Dec 17 '14 at 08:50
You are right. I changed the title. The disk is in a software RAID 1 and the partition is using ext4. – sme, Dec 17 '14 at 10:13

score 1 · Accepted Answer · answered Dec 17 '14 at 08:50

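Your hard disk has issues; replace it and restore from backup.

SMART is not always reliable: the overall-health result can read PASSED while the kernel log already reports unrecovered read errors, as it does here.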
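As a rough illustration of looking past the overall-health verdict, the sketch below filters `smartctl -A` output for the attributes that usually track bad sectors. It assumes the common ATA attribute IDs (5 reallocated, 197 pending, 198 offline uncorrectable) and the usual ten-column attribute table; vendors differ, so treat it as a starting point rather than a definitive check. The function name `flag_sector_attrs` is made up for this example.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print any of the sector-health
# attributes (IDs 5, 197, 198) whose raw value (column 10) is non-zero.
# Assumption: the standard ATA attribute table layout with RAW_VALUE last.
flag_sector_attrs() {
    awk '$1 ~ /^(5|197|198)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Typical use (needs root and smartmontools):
#   smartctl -A /dev/sdb | flag_sector_attrs
```

Any line printed by this filter means the drive itself has counted reallocated or unreadable sectors, even if `smartctl -H` still says PASSED.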

Sven


score 0 · Answer 2 · answered Dec 17 '14 at 13:00

That disk is dying; seeing failed ATA commands like these in the log is not a good sign. You can use smartctl to run a long self-test:

smartctl --test=long /dev/sdb
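The long test runs inside the drive in the background, so its result has to be read from the self-test log afterwards. Independently of SMART, the failing sector from the kernel log can be re-read directly. The sketch below converts the LBA bytes shown in the sense data to a decimal sector number; the helper name `lba_from_hex` is invented for this example, and the commented commands assume the disk is still `/dev/sdb`.

```shell
#!/bin/sh
# Convert the space-separated hex bytes of an LBA, as printed in the kernel
# sense data (e.g. "78 4d e6 00"), to a decimal sector number.
lba_from_hex() {
    hex=$(printf '%s' "$1" | tr -d ' ')
    printf '%d\n' "0x$hex"
}

lba_from_hex '78 4d e6 00'    # prints 2018371072, matching the I/O error line

# Follow-up checks (need root; all of them only read from the disk):
#   smartctl -l selftest /dev/sdb               # result of the long test
#   hdparm --read-sector 2018371072 /dev/sdb    # re-read the reported sector
#   badblocks -sv /dev/sdb                      # read-only surface scan
```

If the sector read fails again, or the self-test log shows a read failure, that confirms a media problem rather than a one-off glitch.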

But if you're running this in RAID 1 using MDRAID, I'd look to replace it, to be honest, as it doesn't look great. The exception is if it is connected via a RAID card or SATA extender, in which case try plugging it directly into the motherboard.
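For reference, swapping a member of an MDRAID mirror usually follows the steps sketched below. The function only prints the commands so they can be reviewed before running anything; the array name `/dev/md0` and the partition `/dev/sdb1` are placeholders, not values taken from this system, and `print_replace_steps` is a made-up helper.

```shell
#!/bin/sh
# Print (do not run) the usual mdadm steps for replacing a failing
# RAID 1 member. Arguments: array device, old member, new member.
# Assumption: the new disk gets the same partition layout before --add.
print_replace_steps() {
    array=$1
    old=$2
    new=$3
    echo "mdadm --manage $array --fail $old"
    echo "mdadm --manage $array --remove $old"
    echo "# swap the physical disk and recreate the partition layout"
    echo "mdadm --manage $array --add $new"
    echo "cat /proc/mdstat"
}

print_replace_steps /dev/md0 /dev/sdb1 /dev/sdb1
```

The final `cat /proc/mdstat` shows the resync progress once the new member has been added.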

Yes, I replaced the disk and re-added everything into the RAID. Works well now. – sme Dec 18 '14 at 12:38
