We have a storage machine (file server) with an LSI MegaRAID 9260-16i RAID controller and 15 hard drives configured as RAID5, plus one off-line hard drive that is going to be replaced. Each hard drive is a 4TB SATA3 disk. Recently we found the same "Patrol Read" error event logged for 3 of the on-line hard drives in the RAID5:

Code: 0x0000005f
Class: 3
Locale: 0x02
Event Description: Patrol Read found an uncorrectable medium error on PD 1b(e0xf5/s15) at e0377d38
Event Data:
Device ID: 27
Enclosure Index: 245
Slot Number: 15
LBA: 3761732920

Code: 0x0000005f
Class: 3
Locale: 0x02
Event Description: Patrol Read found an uncorrectable medium error on PD 11(e0xf5/s5) at e0377d38
Event Data:
Device ID: 17
Enclosure Index: 245
Slot Number: 5
LBA: 3761732920

Code: 0x0000005f
Class: 3
Locale: 0x02
Event Description: Patrol Read found an uncorrectable medium error on PD 15(e0xf5/s12) at e0377d38
Event Data:
Device ID: 21
Enclosure Index: 245
Slot Number: 12
LBA: 3761732920

It is quite strange that the uncorrectable medium error appeared at exactly the same address, e0377d38, on three hard drives: (e0xf5/s15), (e0xf5/s5), and (e0xf5/s12). It caused the RAID card to start background initialization automatically, again and again. From the event log it seems that the RAID card tried to fix this error but always failed. Hence I have temporarily aborted the background initialization.
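To double-check that the three events really point at the same place, here is a minimal Python sketch that parses the three "Event Description" lines above and decodes the hexadecimal fields. It assumes 512-byte sectors and that the reported LBA can be divided directly by the 64 KB strip size from -LDInfo (i.e., no data offset on the physical disk); both are my assumptions, not something the controller reports. The names `parse_events` and `strip_position` are made up for this illustration.

```python
import re

# The three "Event Description" lines copied from the event log above.
EVENTS = """\
Patrol Read found an uncorrectable medium error on PD 1b(e0xf5/s15) at e0377d38
Patrol Read found an uncorrectable medium error on PD 11(e0xf5/s5) at e0377d38
Patrol Read found an uncorrectable medium error on PD 15(e0xf5/s12) at e0377d38
"""

SECTOR_BYTES = 512        # assumption: 512-byte logical sectors
STRIP_BYTES = 64 * 1024   # "Strip Size: 64 KB" from -LDInfo

# PD id, enclosure id and address are hexadecimal; the slot is decimal.
PATTERN = re.compile(r"on PD ([0-9a-f]+)\(e0x([0-9a-f]+)/s(\d+)\) at ([0-9a-f]+)")


def parse_events(text):
    """Return one dict per event with all fields converted to integers."""
    rows = []
    for match in PATTERN.finditer(text):
        dev, encl, slot, lba = match.groups()
        rows.append({
            "device_id": int(dev, 16),
            "enclosure": int(encl, 16),
            "slot": int(slot),
            "lba": int(lba, 16),
        })
    return rows


def strip_position(lba):
    """Return (strip index on the disk, sector offset inside that strip)."""
    sectors_per_strip = STRIP_BYTES // SECTOR_BYTES
    return divmod(lba, sectors_per_strip)


events = parse_events(EVENTS)
for ev in events:
    strip, offset = strip_position(ev["lba"])
    print(ev, "strip", strip, "sector offset", offset)
```

The decoded values agree with the "Event Data" fields (Device IDs 27, 17, 21, enclosure 245, LBA 3761732920), so all three drives report the error at the same sector of the same strip index.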

We are not sure what caused this problem. About 3 weeks ago we replaced two of the above three hard disks, (e0xf5/s5) and (e0xf5/s12), because the original disks had failed. The problem seems to have appeared after that replacement.

Could anyone suggest how to fix this problem? And what are the consequences if it cannot be fixed?

Finally, our -AdpAllInfo and -LDInfo output, and the -PDList output for those three hard drives, are attached below for further information. Thank you very much for your kind help.

T.H.Hsieh

-AdpAllInfo

Adapter #0

==============================================================================
                    Versions
                ================
Product Name    : LSI MegaRAID SAS 9260-16i
Serial No       : SV21116679
FW Package Build: 12.9.0-0038

                    Mfg. Data
                ================
Mfg. Date       : 03/17/12
Rework Date     : 00/00/00
Revision No     : 20C
Battery FRU     : N/A

                Image Versions in Flash:
                ================
BIOS Version       : 3.18.00_4.09.05.00_0x0416A000
FW Version         : 2.90.03-0933
Preboot CLI Version: 04.04-010:#%00008
WebBIOS Version    : 6.0-18-e_13-Rel
NVDATA Version     : 2.06.03-0010
Boot Block Version : 2.02.00.00-0000
BOOT Version       : 01.250.04.219

                Pending Images in Flash
                ================
None

                PCI Info
                ================
Controller Id   : 0000
Vendor Id       : 1000
Device Id       : 0079
SubVendorId     : 1000
SubDeviceId     : 9276

Host Interface  : PCIE

Number of Frontend Port: 0 
Device Interface  : PCIE

Number of Backend Port: 8 
Port  :  Address
0        500062b20037c5ff 
1        0000000000000000 
2        0000000000000000 
3        0000000000000000 
4        0000000000000000 
5        0000000000000000 
6        0000000000000000 
7        0000000000000000 

                HW Configuration
                ================
SAS Address      : 500062b20037c5c0
BBU              : Absent
Alarm            : Present
NVRAM            : Present
Serial Debugger  : Present
Memory           : Present
Flash            : Present
Memory Size      : 512MB
TPM              : Absent
On board Expander: Present
Upgrade Key      : Absent
Temperature sensor for ROC    : Absent
Temperature sensor for controller    : Absent

On board Expander FW version : 25.05.04.00

                Settings
                ================
Current Time                     : 17:15:57 3/3, 2021
Predictive Fail Poll Interval    : 300sec
Interrupt Throttle Active Count  : 16
Interrupt Throttle Completion    : 50us
Rebuild Rate                     : 30%
PR Rate                          : 30%
BGI Rate                         : 30%
Check Consistency Rate           : 30%
Reconstruction Rate              : 30%
Cache Flush Interval             : 4s
Max Drives to Spinup at One Time : 24
Delay Among Spinup Groups        : 2s
Physical Drive Coercion Mode     : Disabled
Cluster Mode                     : Disabled
Alarm                            : Disabled
Auto Rebuild                     : Enabled
Battery Warning                  : Disabled
Ecc Bucket Size                  : 15
Ecc Bucket Leak Rate             : 1440 Minutes
Restore HotSpare on Insertion    : Disabled
Expose Enclosure Devices         : Enabled
Maintain PD Fail History         : Enabled
Host Request Reordering          : Enabled
Auto Detect BackPlane Enabled    : SGPIO/i2c SEP
Load Balance Mode                : Auto
Use FDE Only                     : No
Security Key Assigned            : No
Security Key Failed              : No
Security Key Not Backedup        : No
Default LD PowerSave Policy      : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 0 
Auto Enhanced Import             : No
Any Offline VD Cache Preserved   : No
Allow Boot with Preserved Cache  : No
Disable Online Controller Reset  : No
PFK in NVRAM                     : No
Use disk activity for locate     : No
POST delay                       : 90 seconds

                Capabilities
                ================
RAID Level Supported             : RAID0, RAID1, RAID5, RAID6, RAID00, RAID10, RAID50, RAID60, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives                 : SAS, SATA

Allowed Mixing:

Mix in Enclosure Allowed
Mix of SAS/SATA of HDD type in VD Allowed

                Status
                ================
ECC Bucket Count                 : 0

                Limitations
                ================
Max Arms Per VD          : 32 
Max Spans Per VD         : 8 
Max Arrays               : 128 
Max Number of VDs        : 64 
Max Parallel Commands    : 1008 
Max SGE Count            : 60 
Max Data Transfer Size   : 8192 sectors 
Max Strips PerIO         : 42 
Max LD per array         : 16 
Min Strip Size           : 8 KB
Max Strip Size           : 1.0 MB
Max Configurable CacheCade Size: 0 GB
Current Size of CacheCade      : 0 GB
Current Size of FW Cache       : 431 MB

                Device Present
                ================
Virtual Drives    : 1 
  Degraded        : 0 
  Offline         : 0 
Physical Devices  : 17 
  Disks           : 16 
  Critical Disks  : 0 
  Failed Disks    : 0 

                Supported Adapter Operations
                ================
Rebuild Rate                    : Yes
CC Rate                         : Yes
BGI Rate                        : Yes
Reconstruct Rate                : Yes
Patrol Read Rate                : Yes
Alarm Control                   : Yes
Cluster Support                 : No
BBU                             : Yes
Spanning                        : Yes
Dedicated Hot Spare             : Yes
Revertible Hot Spares           : Yes
Foreign Config Import           : Yes
Self Diagnostic                 : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares               : Yes
Deny SCSI Passthrough           : No
Deny SMP Passthrough            : No
Deny STP Passthrough            : No
Support Security                : No
Snapshot Enabled                : No
Support the OCE without adding drives : Yes
Support PFK                     : No
Support PI                      : No
Support Boot Time PFK Change    : No
Disable Online PFK Change       : No
Support Shield State            : No
Block SSD Write Disk Cache Change: No

                Supported VD Operations
                ================
Read Policy          : Yes
Write Policy         : Yes
IO Policy            : Yes
Access Policy        : Yes
Disk Cache Policy    : Yes
Reconstruction       : Yes
Deny Locate          : No
Deny CC              : No
Allow Ctrl Encryption: No
Enable LDBBM         : No
Support Breakmirror  : No
Power Savings        : No

                Supported PD Operations
                ================
Force Online                            : Yes
Force Offline                           : Yes
Force Rebuild                           : Yes
Deny Force Failed                       : No
Deny Force Good/Bad                     : No
Deny Missing Replace                    : No
Deny Clear                              : No
Deny Locate                             : No
Support Temperature                     : No
Disable Copyback                        : No
Enable JBOD                             : No
Enable Copyback on SMART                : No
Enable Copyback to SSD on SMART Error   : Yes
Enable SSD Patrol Read                  : No
PR Correct Unconfigured Areas           : Yes
Enable Spin Down of UnConfigured Drives : Yes
Disable Spin Down of hot spares         : Yes
Spin Down time                          : 30 
T10 Power State                         : No
                Error Counters
                ================
Memory Correctable Errors   : 0 
Memory Uncorrectable Errors : 0 

                Cluster Information
                ================
Cluster Permitted     : No
Cluster Active        : No

                Default Settings
                ================
Phy Polarity                     : 0 
Phy PolaritySplit                : 0 
Background Rate                  : 30 
Strip Size                       : 64kB
Flush Time                       : 4 seconds
Write Policy                     : WB
Read Policy                      : Adaptive
Cache When BBU Bad               : Disabled
Cached IO                        : No
SMART Mode                       : Mode 6
Alarm Disable                    : Yes
Coercion Mode                    : None
ZCR Config                       : Unknown
Dirty LED Shows Drive Activity   : No
BIOS Continue on Error           : No
Spin Down Mode                   : None
Allowed Device Type              : SAS/SATA Mix
Allow Mix in Enclosure           : Yes
Allow HDD SAS/SATA Mix in VD     : Yes
Allow SSD SAS/SATA Mix in VD     : No
Allow HDD/SSD Mix in VD          : No
Allow SATA in Cluster            : No
Max Chained Enclosures           : 16 
Disable Ctrl-R                   : Yes
Enable Web BIOS                  : Yes
Direct PD Mapping                : No
BIOS Enumerate VDs               : Yes
Restore Hot Spare on Insertion   : No
Expose Enclosure Devices         : Yes
Maintain PD Fail History         : Yes
Disable Puncturing               : No
Zero Based Enclosure Enumeration : No
PreBoot CLI Enabled              : Yes
LED Show Drive Activity          : Yes
Cluster Disable                  : Yes
SAS Disable                      : No
Auto Detect BackPlane Enable     : SGPIO/i2c SEP
Use FDE Only                     : No
Enable Led Header                : No
Delay during POST                : 0 
EnableCrashDump                  : No
Disable Online Controller Reset  : No
EnableLDBBM                      : No
Un-Certified Hard Disk Drives    : Allow
Treat Single span R1E as R10     : No
Max LD per array                 : 16
Power Saving option              : All power saving options are enabled
Default spin down time in minutes: 30 
Enable JBOD                      : No
TTY Log In Flash                 : No
Auto Enhanced Import             : No
BreakMirror RAID Support         : No
Disable Join Mirror              : No
Enable Shield State              : No
Time taken to detect CME         : 60s

Exit Code: 0x00

-LDInfo

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 50.934 TB
Parity Size         : 3.637 TB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 15
Span Depth          : 1
Default Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteThrough, ReadAdaptive, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Is VD Cached: No


Number of Dedicated Hot Spares: 1
    0 : EnclId - 245 SlotId - 15 

Exit Code: 0x00

-PDList

Enclosure Device ID: 245
Slot Number: 5
Drive's postion: DiskGroup: 0, Span: 0, Arm: 5
Enclosure position: 0
Device Id: 17
WWN: 5000CCA25DCF7521
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: 1M02
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500062b20037c5cd
Connected Port Number: 0(path0) 
Inquiry Data: K4H3047B            WDC WD4002FYYZ-01B7CB0                  01.01M02
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature : N/A
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Drive has flagged a S.M.A.R.T alert : No

Enclosure Device ID: 245
Slot Number: 12
Drive's postion: DiskGroup: 0, Span: 0, Arm: 12
Enclosure position: 0
Device Id: 21
WWN: 5000CCA244CAFCD1
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: T907
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500062b20037c5d8
Connected Port Number: 0(path0) 
Inquiry Data: N8GT59EY            HGST HUS726040ALE610                    APGNT907
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature : N/A
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Drive has flagged a S.M.A.R.T alert : No

Enclosure Device ID: 245
Slot Number: 15
Enclosure position: 0
Device Id: 27
WWN: 5000CCA25DCE7105
Sequence Number: 5
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Hotspare Information: 
Type: Dedicated, is revertible
Array #: 0

Raw Size: 3.638 TB [0x1d1c0beb0 Sectors]
Non Coerced Size: 3.637 TB [0x1d1b0beb0 Sectors]
Coerced Size: 3.637 TB [0x1d1b00000 Sectors]
Firmware state: Hotspare, Spun Up
Device Firmware Level: 1M02
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x500062b20037c5db
Connected Port Number: 0(path0) 
Inquiry Data: K4H0SV7B            WDC WD4002FYYZ-01B7CB0                  01.01M02
Hotspare Information: 
Type: Dedicated, is revertible
Array #: 0

FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature : N/A
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Drive has flagged a S.M.A.R.T alert : No


Hotspare Information: 
Type: Dedicated, is revertible
Array #: 0
  • After some research, it looks like we need to back up all the data on the RAID5 and then do a full initialization of the "virtual drive", i.e., the whole RAID5, with "megacli -LDInit -Full -L0 -a0". This operation will erase all the existing data. After that we would rebuild the RAID and restore the data. Unfortunately, right now we have no extra disk space for a full backup of the 56TB of data, so this test is postponed for now. – Tung-Han Hsieh Mar 09 '21 at 02:41
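
For reference, the 56TB figure and the 50.934 TB reported by -LDInfo appear to be the same capacity in different units. A rough sketch of the arithmetic, assuming 512-byte sectors and taking the per-drive "Coerced Size" sector count from -PDList (the function name `raid5_usable_bytes` is only for illustration):

```python
COERCED_SECTORS = 0x1d1b00000  # "Coerced Size" per drive, from -PDList
SECTOR_BYTES = 512             # assumption: 512-byte logical sectors
DRIVES = 15                    # "Number Of Drives" from -LDInfo


def raid5_usable_bytes(drives, sectors_per_drive, sector_bytes=SECTOR_BYTES):
    """RAID5 keeps one drive's worth of parity, so n - 1 drives hold data."""
    return (drives - 1) * sectors_per_drive * sector_bytes


usable = raid5_usable_bytes(DRIVES, COERCED_SECTORS)
print(f"{usable / 2**40:.3f} TiB (binary units, as MegaCli appears to report)")
print(f"{usable / 10**12:.3f} TB (decimal units, as drive vendors count)")
```

So a full backup would need roughly 56 decimal TB of free space only if the array were completely full; the space actually required depends on how much data the file system holds.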
