What is the safe way to run fsck or badblocks on a XenServer local storage repository?

Question

I have a XenServer host with six large local disks configured as storage repositories. The disks are attached in pairs to three VMs, which share them over NFS to provide backup volumes for multiple hypervisors of different types.

There is no RAID on the big disks. Each of the three VMs boots from a separate area of a smaller, RAIDed boot storage and then serves its two large disks over NFS. It's a basic system, but it mostly works.

One of the disks, a fairly recently installed 18TB WD Gold, has started going read-only within its VM, causing backups to fail.

So before doing anything else, I want to thoroughly check it using fsck and badblocks, from the XenServer host. I've already run smartctl -H and that shows a pass.
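For context, smartctl -H only reports the drive's overall health verdict, so a fuller read-only check would likely go further. The following is a minimal sketch of what such a host-side plan might look like, not a confirmed procedure for XenServer. The function name print_check_plan and the device /dev/sdX are placeholders, and the script only prints the commands rather than running them. The -b 8192 value is an assumption, chosen because badblocks has historically limited block counts to 32 bits, which an 18TB disk exceeds at smaller block sizes. fsck is deliberately left out, since the host sees Xen's storage layer on the disk rather than the guest filesystem.

```shell
#!/bin/sh
# Prints (does not run) a read-only check plan for one disk.
# /dev/sdX is a placeholder; substitute the device shown by fdisk -l.
print_check_plan() {
    disk="$1"
    # Full SMART attributes and error log, not just the -H verdict.
    echo "smartctl -a $disk"
    # Start the drive's extended self-test; read the result later with -a.
    echo "smartctl -t long $disk"
    # badblocks only reads unless -n or -w is given; -s shows progress, -v is verbose.
    # Assumption: -b 8192 keeps an 18TB disk's block count within the 32-bit limit.
    echo "badblocks -b 8192 -sv $disk"
}

print_check_plan "${1:-/dev/sdX}"
```

Printing the plan first makes it easy to review the exact device name before anything touches the disk. A read-only badblocks pass on a disk that is still serving VMs will compete with their I/O, so it may be worth scheduling outside the backup window.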

Because of how Xen handles its storage repositories, the disks aren't mounted in the traditional manner; I can only see them via fdisk. Given that they are in use by Xen itself, I'm assuming I can't just run fsck on them.

I've tried googling various versions of "how to fsck a Xen local repository" but have not found anything directly related to my situation.

One option under consideration is to abandon Xen and the VMs, reinstall the machine as a plain Ubuntu host, and share the disks directly from it. However, when the system was first set up there was a need for an additional Xen host as an emergency fallback for other systems, which is why it is configured this way.

Is there a safe way to check the errant disk?

Update: While not a solution to this question, when we extracted the disk from the server, we found that the SATA power pins on the disk and the Molex -> 2 SATA cable powering it were very heavily corroded on some of the pins. The machine in question lives in a proper air-conditioned datacentre and so should not be subject to damp or other corrosion-causing situations, so we're wondering if the underlying problem causing the disk to go read-only was related to unstable power from the corroded contacts. After cleaning contacts, the disk worked OK when installed in a Windows machine at home. — Pyromancer, Nov 04 '22 at 15:59

0 Answers