XFS metadata corruption after reboot

Question

I had a problem on a RAID1 array with 4 disks. We replaced the faulty disk and restarted the server, and the rebuild completed. Afterwards, two CentOS 7 Linux machines did not come up and reported XFS corruption errors. The other machines came up normally. I tried to mount the partition:

# mount /dev/mapper/cs_mbox_opt /mnt
returned: XFS metadata corruption detected at xfs_dir3_leaf_check_init.....

I ran xfs_repair and got a message that the filesystem could not be repaired, with a suggestion to use -L. I then ran xfs_repair -L. After many error messages it stopped without completing the repair, ending with:

Metadata CRC error detected at 0x559d9f7ac1e9. xfs_dir3_block 0x41df0c80/0x1000
corrupt block 0 in directory inode 807368306: junking block
Segmentation fault (core dumped)
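
Because every further repair attempt writes to the device, a common precaution is to work on a copy instead. The following is only a minimal sketch, assuming there is enough free space for a full image. The image path /backup/cs_mbox_opt.img is hypothetical, and the function only prints the commands so they can be reviewed before anything is run:

```shell
#!/bin/sh
# Print (do not execute) a conservative sequence: image the device,
# check the copy without modifying it, then try a read-only mount of the copy.
# $1 = source device, $2 = image file (hypothetical path), $3 = mount point
print_recovery_plan() {
    # conv=noerror,sync continues past read errors and pads short reads
    echo "dd if=$1 of=$2 bs=1M conv=noerror,sync"
    # -n: report problems without changing anything; -f: target is a regular file
    echo "xfs_repair -n -f $2"
    # norecovery skips log replay and requires a read-only mount
    echo "mount -o ro,norecovery,loop $2 $3"
}

print_recovery_plan /dev/mapper/cs_mbox_opt /backup/cs_mbox_opt.img /mnt
```

If the read-only mount of the copy succeeds, files can be copied off it before any destructive repair is attempted on the original volume.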

I then dumped the metadata, restored it to an image file, and ran xfs_repair on that image, but got an error:

Commands:
# xfs_metadump -gwa /dev/mapper/[volume] /tmp/xfsmetadata.img
# xfs_mdrestore -g /tmp/xfsmetadata.img /tmp/xfs_file
# xfs_repair -vf /tmp/xfs_file

Sorry, could not find valid secondary superblock
See attached images.
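
xfs_repair only searches for a secondary superblock after it has rejected the primary one, so a quick sanity check is whether the restored image starts with the XFS superblock magic "XFSB" at offset 0. A minimal sketch follows. The helper name has_xfs_magic is made up for illustration, and a passing check only shows that the first four bytes look right, not that the superblock is valid:

```shell
#!/bin/sh
# Return 0 if the first four bytes of file $1 are the XFS superblock
# magic "XFSB", non-zero otherwise. has_xfs_magic is a hypothetical helper.
has_xfs_magic() {
    # od -c renders NUL bytes as \0, so binary data is compared safely as text
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -c | tr -d ' \n')
    [ "$magic" = "XFSB" ]
}

if has_xfs_magic /tmp/xfs_file; then
    echo "primary superblock magic present"
else
    echo "no XFS magic at offset 0"
fi
```

If the magic is missing, the restore step itself may have failed or the wrong file may have been passed to xfs_repair, which would be worth ruling out before blaming the original filesystem.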

At the moment I don't know what else to do. Any tips?

I mentioned the steps above.

Did you really have a 4-way RAID1 (i.e. four copies of your data)? Are you using hardware or software RAID? - shodanshok, Feb 15 '23 at 15:18
Hi. It is a RAID1 with 4 disks. We replaced the disks and HP's iLO reported that the rebuild completed. - Christovam, Feb 17 '23 at 12:17
The log you provided shows real data and metadata errors, so I don't think you can recover cleanly. I suggest you restore from backups. Without detailed hardware and setup information it is not possible to help further. - shodanshok, Feb 17 '23 at 17:20
I already have a backup, but I would like to recover the data from the day the problem occurred. What kind of information do you need in order to help? - Christovam, Feb 17 '23 at 20:02
To add: the RAID 1 is configured on an HP Smart Array P440 controller. I have 4 disks of 600 GB each. - Christovam, Feb 18 '23 at 12:37
OK, the HP P440 is a real hardware RAID controller, so it should have had no issue replacing a failed disk. Are you sure it was a RAID1 array? If so, please check the server RAM with memtest86 (you can live-boot it). - shodanshok, Feb 18 '23 at 17:47
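
One rough way to answer the "is it really RAID1?" question from inside the OS is to compare the logical drive's size with what each layout would yield from 4 disks of 600 GB. This is a minimal sketch of that arithmetic; the function name usable_gb is made up for illustration, sizes are nominal, and the assumption that the alternative layout is RAID 1+0 is a guess, not something stated in the thread:

```shell
#!/bin/sh
# usable_gb LEVEL DISKS SIZE_GB
# Nominal usable capacity in GB (hypothetical helper, ignores formatting overhead).
usable_gb() {
    case "$1" in
        raid1)  echo "$3" ;;                  # n-way mirror: every disk holds the same data
        raid10) echo $(( $2 / 2 * $3 )) ;;    # striped mirrors: half the disks hold copies
        *)      echo "unknown level: $1" >&2; return 1 ;;
    esac
}

usable_gb raid1 4 600     # prints 600
usable_gb raid10 4 600    # prints 1200
```

A logical drive of roughly 1.2 TB (as shown by lsblk, for example) would point to RAID 1+0 rather than a 4-way mirror. If HPE's ssacli utility happens to be installed, `ssacli ctrl all show config` should also list the RAID level of each logical drive directly.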

0 Answers