
I have some servers running Debian 8 with 8 × 800GB SSDs configured as RAID 6. All disks are connected to an LSI 3008 HBA flashed to IT mode. Each server also has a two-disk RAID 1 pair for the OS.

Current state

# dpkg -l|grep mdad
ii  mdadm                          3.3.2-5+deb8u1              amd64        tool to administer Linux MD arrays (software RAID)

# uname -a
Linux R5U32-B 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2 (2016-04-08) x86_64 GNU/Linux

# more /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid6 sde1[1](F) sdg1[3] sdf1[2] sdd1[0] sdh1[7] sdb1[6] sdj1[5] sdi1[4]
      4687678464 blocks super 1.2 level 6, 512k chunk, algorithm 2 [8/7] [U_UUUUUU]
      bitmap: 3/6 pages [12KB], 65536KB chunk

md1 : active (auto-read-only) raid1 sda5[0] sdc5[1]
      62467072 blocks super 1.2 [2/2] [UU]
        resync=PENDING

md0 : active raid1 sda2[0] sdc2[1]
      1890881536 blocks super 1.2 [2/2] [UU]
      bitmap: 2/15 pages [8KB], 65536KB chunk

unused devices: <none>

# mdadm --detail /dev/md2
/dev/md2:
        Version : 1.2
  Creation Time : Fri Jun 24 04:35:18 2016
     Raid Level : raid6
     Array Size : 4687678464 (4470.52 GiB 4800.18 GB)
  Used Dev Size : 781279744 (745.09 GiB 800.03 GB)
   Raid Devices : 8
  Total Devices : 8
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Tue Jul 19 17:36:15 2016
          State : active, degraded
 Active Devices : 7
Working Devices : 7
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           Name : R5U32-B:2  (local to host R5U32-B)
           UUID : 24299038:57327536:4db96d98:d6e914e2
         Events : 2514191

    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/sdd1
       2       0        0        2      removed
       2       8       81        2      active sync   /dev/sdf1
       3       8       97        3      active sync   /dev/sdg1
       4       8      129        4      active sync   /dev/sdi1
       5       8      145        5      active sync   /dev/sdj1
       6       8       17        6      active sync   /dev/sdb1
       7       8      113        7      active sync   /dev/sdh1

       1       8       65        -      faulty   /dev/sde1

Problem

The RAID 6 array degrades semi-regularly, every 1-3 days or so. The reason is that one (any one) of its disks shows up as faulty with the following error:

# dmesg -T
[Sat Jul 16 05:38:45 2016] sd 0:0:3:0: attempting task abort! scmd(ffff8810350cbe00)
[Sat Jul 16 05:38:45 2016] sd 0:0:3:0: [sde] CDB:
[Sat Jul 16 05:38:45 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Sat Jul 16 05:38:45 2016] scsi target0:0:3: handle(0x000d), sas_address(0x500304801707a443), phy(3)
[Sat Jul 16 05:38:45 2016] scsi target0:0:3: enclosure_logical_id(0x500304801707a47f), slot(3)
[Sat Jul 16 05:38:46 2016] sd 0:0:3:0: task abort: SUCCESS scmd(ffff8810350cbe00)
[Sat Jul 16 05:38:46 2016] end_request: I/O error, dev sde, sector 2064
[Sat Jul 16 05:38:46 2016] md: super_written gets error=-5, uptodate=0
[Sat Jul 16 05:38:46 2016] md/raid:md2: Disk failure on sde1, disabling device.md/raid:md2: Operation continuing on 7 devices.
[Sat Jul 16 05:38:46 2016] RAID conf printout:
[Sat Jul 16 05:38:46 2016]  --- level:6 rd:8 wd:7
[Sat Jul 16 05:38:46 2016]  disk 0, o:1, dev:sdd1
[Sat Jul 16 05:38:46 2016]  disk 1, o:0, dev:sde1
[Sat Jul 16 05:38:46 2016]  disk 2, o:1, dev:sdf1
[Sat Jul 16 05:38:46 2016]  disk 3, o:1, dev:sdg1
[Sat Jul 16 05:38:46 2016]  disk 4, o:1, dev:sdi1
[Sat Jul 16 05:38:46 2016]  disk 5, o:1, dev:sdj1
[Sat Jul 16 05:38:46 2016]  disk 6, o:1, dev:sdb1
[Sat Jul 16 05:38:46 2016]  disk 7, o:1, dev:sdh1
[Sat Jul 16 05:38:46 2016] RAID conf printout:
[Sat Jul 16 05:38:46 2016]  --- level:6 rd:8 wd:7
[Sat Jul 16 05:38:46 2016]  disk 0, o:1, dev:sdd1
[Sat Jul 16 05:38:46 2016]  disk 2, o:1, dev:sdf1
[Sat Jul 16 05:38:46 2016]  disk 3, o:1, dev:sdg1
[Sat Jul 16 05:38:46 2016]  disk 4, o:1, dev:sdi1
[Sat Jul 16 05:38:46 2016]  disk 5, o:1, dev:sdj1
[Sat Jul 16 05:38:46 2016]  disk 6, o:1, dev:sdb1
[Sat Jul 16 05:38:46 2016]  disk 7, o:1, dev:sdh1
[Sat Jul 16 12:40:00 2016] sd 0:0:7:0: attempting task abort! scmd(ffff88000d76eb00)

Already tried

I have already tried the following, with no improvement:

  • Increase /sys/block/md2/md/stripe_cache_size from 256 to 16384
  • Increase dev.raid.speed_limit_min from 1000 to 50000
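
These were applied at runtime roughly as follows (a sketch; md2 and the values are the ones listed above, and neither setting persists across a reboot unless made permanent elsewhere):

# echo 16384 > /sys/block/md2/md/stripe_cache_size
# sysctl -w dev.raid.speed_limit_min=50000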

Need your help

Are these errors caused by the mdadm configuration, the kernel, or the controller?

Update 20160802

Following the advice of ppetraki and others, I tried the following:

  • Use raw disks instead of partitions

    This did not solve the issue.

  • Decrease the chunk size

    The chunk size was changed to 128KB and then 64KB, but the RAID volume still degraded within a few days; dmesg shows errors similar to the previous ones. I forgot to try reducing the chunk size to 32KB.

  • Reduce the array to 6 disks

    I destroyed the existing RAID, zeroed the superblock on each disk, and created a RAID 6 array from 6 disks (raw disks) with 64KB chunks. Reducing the number of disks seems to make the array live longer, around 4-7 days before it degrades.

  • Update the driver

    I updated the driver to Linux_Driver_RHEL6-7_SLES11-12_P12 (http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9300-8e). Disk errors still appear, as below:

[Tue Aug  2 17:57:48 2016] sd 0:0:6:0: attempting task abort! scmd(ffff880fc0dd1980)
[Tue Aug  2 17:57:48 2016] sd 0:0:6:0: [sdg] CDB:
[Tue Aug  2 17:57:48 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Tue Aug  2 17:57:48 2016] scsi target0:0:6: handle(0x0010), sas_address(0x50030480173ee946), phy(6)
[Tue Aug  2 17:57:48 2016] scsi target0:0:6: enclosure_logical_id(0x50030480173ee97f), slot(6)
[Tue Aug  2 17:57:49 2016] sd 0:0:6:0: task abort: SUCCESS scmd(ffff880fc0dd1980)
[Tue Aug  2 17:57:49 2016] end_request: I/O error, dev sdg, sector 0

Just a few moments ago, the array degraded again. This time /dev/sdf and /dev/sdg show the "attempting task abort! scmd" error:

[Tue Aug  2 21:26:02 2016]  
[Tue Aug  2 21:26:02 2016] sd 0:0:5:0: [sdf] CDB:
[Tue Aug  2 21:26:02 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Tue Aug  2 21:26:02 2016] scsi target0:0:5: handle(0x000f), sas_address(0x50030480173ee945), phy(5)
[Tue Aug  2 21:26:02 2016] scsi target0:0:5: enclosure logical id(0x50030480173ee97f), slot(5)
[Tue Aug  2 21:26:02 2016] scsi target0:0:5: enclosure level(0x0000), connector name(     ^A)
[Tue Aug  2 21:26:03 2016] sd 0:0:5:0: task abort: SUCCESS scmd(ffff88103beb5240)
[Tue Aug  2 21:26:03 2016] sd 0:0:5:0: attempting task abort! scmd(ffff88107934e080)
[Tue Aug  2 21:26:03 2016] sd 0:0:5:0: [sdf] CDB:
[Tue Aug  2 21:26:03 2016] Read(10): 28 00 04 75 3b f8 00 00 08 00
[Tue Aug  2 21:26:03 2016] scsi target0:0:5: handle(0x000f), sas_address(0x50030480173ee945), phy(5)
[Tue Aug  2 21:26:03 2016] scsi target0:0:5: enclosure logical id(0x50030480173ee97f), slot(5)
[Tue Aug  2 21:26:03 2016] scsi target0:0:5: enclosure level(0x0000), connector name(     ^A)
[Tue Aug  2 21:26:03 2016] sd 0:0:5:0: task abort: SUCCESS scmd(ffff88107934e080)
[Tue Aug  2 21:26:04 2016] sd 0:0:5:0: [sdf] CDB:
[Tue Aug  2 21:26:04 2016] Read(10): 28 00 04 75 3b f8 00 00 08 00
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         sas_address(0x50030480173ee945), phy(5)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         enclosure logical id(0x50030480173ee97f), slot(5)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         enclosure level(0x0000), connector name(     ^A)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         handle(0x000f), ioc_status(success)(0x0000), smid(35)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         request_len(4096), underflow(4096), resid(-4096)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         tag(65535), transfer_count(8192), sc->result(0x00000000)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[Tue Aug  2 21:26:04 2016] mpt3sas_cm0:         [sense_key,asc,ascq]: [0x06,0x29,0x00], count(18)
[Tue Aug  2 22:14:51 2016] sd 0:0:6:0: attempting task abort! scmd(ffff880931d8c840)
[Tue Aug  2 22:14:51 2016] sd 0:0:6:0: [sdg] CDB:
[Tue Aug  2 22:14:51 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Tue Aug  2 22:14:51 2016] scsi target0:0:6: handle(0x0010), sas_address(0x50030480173ee946), phy(6)
[Tue Aug  2 22:14:51 2016] scsi target0:0:6: enclosure logical id(0x50030480173ee97f), slot(6)
[Tue Aug  2 22:14:51 2016] scsi target0:0:6: enclosure level(0x0000), connector name(     ^A)
[Tue Aug  2 22:14:51 2016] sd 0:0:6:0: task abort: SUCCESS scmd(ffff880931d8c840)
[Tue Aug  2 22:14:52 2016] sd 0:0:6:0: [sdg] CDB:
[Tue Aug  2 22:14:52 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         sas_address(0x50030480173ee946), phy(6)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         enclosure logical id(0x50030480173ee97f), slot(6)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         enclosure level(0x0000), connector name(     ^A)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         handle(0x0010), ioc_status(success)(0x0000), smid(85)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         request_len(0), underflow(0), resid(-8192)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         tag(65535), transfer_count(8192), sc->result(0x00000000)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         scsi_status(check condition)(0x02), scsi_state(autosense valid )(0x01)
[Tue Aug  2 22:14:52 2016] mpt3sas_cm0:         [sense_key,asc,ascq]: [0x06,0x29,0x00], count(18)
[Tue Aug  2 22:14:52 2016] end_request: I/O error, dev sdg, sector 16
[Tue Aug  2 22:14:52 2016] md: super_written gets error=-5, uptodate=0
[Tue Aug  2 22:14:52 2016] md/raid:md2: Disk failure on sdg, disabling device. md/raid:md2: Operation continuing on 5 devices.
[Tue Aug  2 22:14:52 2016] RAID conf printout:
[Tue Aug  2 22:14:52 2016]  --- level:6 rd:6 wd:5
[Tue Aug  2 22:14:52 2016]  disk 0, o:1, dev:sdc
[Tue Aug  2 22:14:52 2016]  disk 1, o:1, dev:sdd
[Tue Aug  2 22:14:52 2016]  disk 2, o:1, dev:sde
[Tue Aug  2 22:14:52 2016]  disk 3, o:1, dev:sdf
[Tue Aug  2 22:14:52 2016]  disk 4, o:0, dev:sdg
[Tue Aug  2 22:14:52 2016]  disk 5, o:1, dev:sdh
[Tue Aug  2 22:14:52 2016] RAID conf printout:
[Tue Aug  2 22:14:52 2016]  --- level:6 rd:6 wd:5
[Tue Aug  2 22:14:52 2016]  disk 0, o:1, dev:sdc
[Tue Aug  2 22:14:52 2016]  disk 1, o:1, dev:sdd
[Tue Aug  2 22:14:52 2016]  disk 2, o:1, dev:sde
[Tue Aug  2 22:14:52 2016]  disk 3, o:1, dev:sdf
[Tue Aug  2 22:14:52 2016]  disk 5, o:1, dev:sdh

I assume that the "attempting task abort! scmd" error leads to the array degrading, but I don't know what causes it.

Update 20160806

I set up another server with the same specs. Without mdadm RAID, each disk is mounted directly with an ext4 filesystem. After a while, the kernel log shows "attempting task abort! scmd" on some disks. This leads to an error on /dev/sdd1, which is then remounted read-only:

$ dmesg -T
[Sat Aug  6 05:21:09 2016] sd 0:0:3:0: [sdd] CDB:
[Sat Aug  6 05:21:09 2016] Read(10): 28 00 2d 29 21 00 00 00 20 00
[Sat Aug  6 05:21:09 2016] scsi target0:0:3: handle(0x000a), sas_address(0x4433221103000000), phy(3)
[Sat Aug  6 05:21:09 2016] scsi target0:0:3: enclosure_logical_id(0x500304801a5d3f01), slot(3)
[Sat Aug  6 05:21:09 2016] sd 0:0:3:0: task abort: SUCCESS scmd(ffff88006b206800)
[Sat Aug  6 05:21:09 2016] sd 0:0:3:0: attempting task abort! scmd(ffff88019a3a07c0)
[Sat Aug  6 05:21:09 2016] sd 0:0:3:0: [sdd] CDB:
[Sat Aug  6 05:21:09 2016] Read(10): 28 00 08 46 8f 80 00 00 20 00
[Sat Aug  6 05:21:09 2016] scsi target0:0:3: handle(0x000a), sas_address(0x4433221103000000), phy(3)
[Sat Aug  6 05:21:09 2016] scsi target0:0:3: enclosure_logical_id(0x500304801a5d3f01), slot(3)
[Sat Aug  6 05:21:09 2016] sd 0:0:3:0: task abort: SUCCESS scmd(ffff88019a3a07c0)
[Sat Aug  6 05:21:10 2016] sd 0:0:3:0: attempting device reset! scmd(ffff880f9a49ac80)
[Sat Aug  6 05:21:10 2016] sd 0:0:3:0: [sdd] CDB:
[Sat Aug  6 05:21:10 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Sat Aug  6 05:21:10 2016] scsi target0:0:3: handle(0x000a), sas_address(0x4433221103000000), phy(3)
[Sat Aug  6 05:21:10 2016] scsi target0:0:3: enclosure_logical_id(0x500304801a5d3f01), slot(3)
[Sat Aug  6 05:21:10 2016] sd 0:0:3:0: device reset: SUCCESS scmd(ffff880f9a49ac80)
[Sat Aug  6 05:21:10 2016] mpt3sas0: log_info(0x31110e03): originator(PL), code(0x11), sub_code(0x0e03)
[Sat Aug  6 05:21:10 2016] mpt3sas0: log_info(0x31110e03): originator(PL), code(0x11), sub_code(0x0e03)
[Sat Aug  6 05:21:10 2016] mpt3sas0: log_info(0x31110e03): originator(PL), code(0x11), sub_code(0x0e03)
[Sat Aug  6 05:21:11 2016] end_request: I/O error, dev sdd, sector 780443696
[Sat Aug  6 05:21:11 2016] Aborting journal on device sdd1-8.
[Sat Aug  6 05:21:11 2016] EXT4-fs error (device sdd1): ext4_journal_check_start:56: Detected aborted journal
[Sat Aug  6 05:21:11 2016] EXT4-fs (sdd1): Remounting filesystem read-only
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: attempting task abort! scmd(ffff88024fc08340)
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: [sdf] CDB:
[Sat Aug  6 05:40:35 2016] Synchronize Cache(10): 35 00 00 00 00 00 00 00 00 00
[Sat Aug  6 05:40:35 2016] scsi target0:0:5: handle(0x000c), sas_address(0x4433221105000000), phy(5)
[Sat Aug  6 05:40:35 2016] scsi target0:0:5: enclosure_logical_id(0x500304801a5d3f01), slot(5)
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: task abort: FAILED scmd(ffff88024fc08340)
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: attempting task abort! scmd(ffff88019a12ee00)
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: [sdf] CDB:
[Sat Aug  6 05:40:35 2016] Read(10): 28 00 27 c8 b4 e0 00 00 20 00
[Sat Aug  6 05:40:35 2016] scsi target0:0:5: handle(0x000c), sas_address(0x4433221105000000), phy(5)
[Sat Aug  6 05:40:35 2016] scsi target0:0:5: enclosure_logical_id(0x500304801a5d3f01), slot(5)
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: task abort: SUCCESS scmd(ffff88019a12ee00)
[Sat Aug  6 05:40:35 2016] sd 0:0:5:0: attempting task abort! scmd(ffff88203eaddac0)

Update 20160930

After the controller firmware was upgraded to the latest version (currently 12.00.02), the issue disappeared.
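
(For reference, a rough way to confirm the HBA firmware version on a SAS3008 in IT mode, assuming Avago's sas3flash utility is installed; the mpt3sas driver also logs the firmware version at load time:)

# sas3flash -list
# dmesg | grep -i fwversion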

Conclusion

The issue was solved by upgrading the controller firmware.

junior_h
  • Is it always `/dev/sde` that drops out of the array? – bodgit Jul 19 '16 at 11:22
  • No, /dev/sde, /dev/sdf, and /dev/sdg have dropped out of the array before. – junior_h Jul 19 '16 at 11:31
  • Hi junior_h. I edited your question mainly for grammar. I believe I stayed to your original intent, but if you disagree, feel free to either [edit] further or roll back the edit if you disagree strongly. – user Jul 19 '16 at 12:24
  • also, why'd you use partitions? If those aren't aligned properly that'll definitely cause problems. – ppetraki Jul 19 '16 at 13:45
  • I thought it would be easier when replacing a disk with a different size. I just checked with fdisk that all disks have the same alignment (start=2048; end=1562824367; sectors=1562822320; size=745.2G; Id=83; Type=Linux) – junior_h Jul 20 '16 at 04:51
  • So it's advertising 512B sectors (1562822320 * 512 / 1024^3 = 745.2GiB) and the partition starts 1K from the beginning. However, those things like to write in 4K boundaries so that's really what you want your offset to be, or just eliminate the damn partition altogether. Since it's SAS, you can use sg_format to set the advertised size of the block device. In addition to that, you could try over-provisioning your disks, that is, only allocate 80% of the storage of each disk to the RAID. – ppetraki Jul 20 '16 at 15:08
  • You might find these links helpful: https://www.research.ibm.com/haifa/conferences/systor2011/present/session5_talk2_systor2011.pdf SSD's don't like being full. http://www.iscsi.com/pdf/RAIDOptimization-Whitepaper.pdf , pp4-5 Finally, Linux filesystems can be made aware of the stripe size of your backing store at the time they are formatted. I think XFS auto detects but it's not something you want to leave to chance. – ppetraki Jul 20 '16 at 15:30
  • I goofed your partition offset. 2048 * 512 / 1024^2 = 1 MiB – ppetraki Jul 20 '16 at 16:21

2 Answers


That's a pretty big stripe: (8 − 2) × 512K = 3MiB, and not an even one either. Bring your array up to 10 disks (8 data + 2 parity) or down to 4 data + 2 parity, with a total stripe size of 256K, or 64K per drive. It could be that the cache is mad at you for unaligned writes. You could try putting all the drives in write-through mode before you attempt to reconfigure the array.
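
For what it's worth, a sketch of forcing write-through on a member drive (the device name is an example; sdparm clears the SCSI/SAS write-cache-enable bit, hdparm is the SATA equivalent):

# sdparm --clear=WCE --save /dev/sde
# hdparm -W 0 /dev/sde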

Update 7/20/16.

At this point I'm convinced that your RAID configuration is the problem. A 3MiB stripe is just odd; even if it's a multiple of your partition offset [1] (1MiB), it's a sub-optimal stripe size for any RAID, SSD or otherwise. It's probably generating tons of unaligned writes, which forces your SSD to free up more pages than it has readily available, pushes it into the garbage collector constantly, and shortens its useful life. The drive just can't free pages fast enough for writes, so when you finally flush the cache to disk (synchronize cache), it literally fails. You do not have a crash-consistent array, i.e. your data is not safe.

That's my theory based on the available information and the time I can spend on it. You now have before you a "growth opportunity" to become a storage expert ;)

Start over. Don't use partitions. Set a system aside and build an array that has a total stripe size of 128K (a little more conservative to start). In a RAID 6 configuration of N total drives, only N−2 drives hold data at any one time and the remaining two store parity information. So if N=6, a 128K stripe requires 32K chunks (128K spread across 4 data drives). You should be able to see now why 8 is kind of an odd number of drives for a RAID 6.
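
A minimal sketch of that build, assuming 6 whole disks and the 32K chunk worked out above (the device names are examples, and --zero-superblock is destructive):

# mdadm --zero-superblock /dev/sd[b-g]
# mdadm --create /dev/md2 --level=6 --raid-devices=6 --chunk=32 \
      /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg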

Then run fio [2] against the "raw disk" in direct mode and beat on it until you're confident it's solid. Next add the filesystem and inform it of the underlying stripe size (man mkfs.???). Run fio again, but this time use files (or you'll destroy the filesystem), and confirm the array stays up.
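
For illustration only (the target device, run time, and block sizes are assumptions; /mnt/md2 is an assumed mount point; the ext4 values assume 4K filesystem blocks on the 6-disk / 32K-chunk layout above, i.e. stride = 32K/4K = 8 and stripe-width = 8 × 4 data drives = 32):

# fio --name=rawtest --filename=/dev/md2 --direct=1 --rw=randwrite --bs=128k \
      --ioengine=libaio --iodepth=32 --numjobs=4 --runtime=600 --time_based --group_reporting
# mkfs.ext4 -b 4096 -E stride=8,stripe-width=32 /dev/md2
# fio --name=fstest --directory=/mnt/md2 --size=10G --direct=1 --rw=randwrite --bs=128k \
      --ioengine=libaio --iodepth=32 --numjobs=4 --runtime=600 --time_based --group_reporting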

I know this is a lot of "stuff"; just start small, try to understand what it's doing, and keep at it. Tools like blktrace and iostat can help you understand how your applications are writing, which will inform the best stripe/chunk size to use.

  1. https://www.percona.com/blog/2011/06/09/aligning-io-on-a-hard-disk-raid-the-theory/
  2. https://wiki.mikejung.biz/Benchmarking#Fio_Random_Write_and_Random_Read_Command_Line_Examples (my fio cheatsheet)

ppetraki
  • I have other servers with 10 disks set up as RAID 6 and the issue still occurs. I don't know how to change the chunk size; is it set when creating the RAID volume or when formatting the filesystem (ext4)? – junior_h Jul 20 '16 at 05:00
  • That's helpful information as we can rule out defective HW. I think what you have is a configuration problem. What I think is happening, and this is the best I can do without spending hours instrumenting your system, is that you're generating a lot of write amplification with the way your RAID is configured. Synchronize Cache just shouldn't fail. That there's no SCSI sense data is really concerning. What could be happening is the drive is busy doing garbage collection to free up space and simply can't accept the write or the drive went into powersave, essentially disk unavailable. – ppetraki Jul 20 '16 at 14:40
  • chunk size is set during mdadm build time, see man mdadm. – ppetraki Jul 20 '16 at 15:33
  • WRT the 10-disk arrays: if the chunk size is the same, then you have a 4MiB stripe, which just isn't helpful to most applications unless you're streaming video. Another possible theory is that you're simply saturating the controller. A quick google says the LSI 3008 can do 1M 4K IOPS. Well, my Micron M500 can do 80K 4K IOPS random writes on paper. Say you have 10 of those; if everything is working correctly you're using 80% of your IO controller capacity. If you're generating unaligned I/O or lots of RMW you could easily blow your IOPS budget, which could lead to strange behavior. – ppetraki Jul 20 '16 at 20:02
  • ppetraki, thanks for the explanation. The RAID will be used by databases (MySQL/PostgreSQL). From the nmon command, total disk IO is still under control, 1000-2000 IOPS. Following your suggestion, I now have options to reduce the chunk size, change the number of disks in the array, and build the array on raw disks instead of partitions, then see whether the issue appears again in the next few days. – junior_h Jul 23 '16 at 19:00
  • @junior_h any updates? – ppetraki Aug 02 '16 at 18:44
  • ppetraki, I have an update in the post (8/2/2016). It seems none of the suggestions solved the issue, from changing the chunk size, reducing the number of disks in the array, and using raw disks, to updating the LSI 3008 driver & firmware. I'm still curious what the "attempting task abort! scmd" error means. – junior_h Aug 04 '16 at 16:45
  • @junior_h then you have a HW problem, where the disks themselves (their firmware) can be the problem. At this point I would involve your vendor; if you put this together yourself then you'll need to contact LSI directly. It's really concerning that synchronize cache is failing. According to the spec, http://t10.org/ftp/t10/document.05/05-344r0.pdf, there's a very narrow range of error conditions. You're not even seeing those error messages, as the command itself is timing out, which made the LSI adapter abort it. I don't think your data is making it to the disk, at least not on a regular basis. – ppetraki Aug 04 '16 at 18:50
  • ppetraki, this issue has been raised with the supplier. I opened a case here to find out whether there is something wrong with mdadm or the LSI controller, and to let them know about the case. – junior_h Aug 08 '16 at 07:04

To start, check and post your SMART readings. I suspect that your disk is faulty; it looks like a timeout after an attempt to read/write faulty sectors. It may also be a cabling issue (loose contact, broken cable, etc.). I have also seen disks with similar firmware issues. After seeing the SMART data I should have more advice.
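
A typical way to gather those readings with smartmontools (the device name is an example; -x gives the extended output, which is often more useful for SAS-attached drives):

# smartctl -a /dev/sde
# smartctl -x /dev/sde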

  • I don't think so, because all the disks are new. I'm just updating the case: it is now solved by upgrading the controller firmware, after they released it. – junior_h Sep 29 '16 at 21:13