Linux MD RAID with btrfs gets stuck and consumes 100 % CPU

Question

The issue

For the last several weeks, I have been experiencing an annoying issue on my physical server, which runs Linux software RAID 6 (md, mdadm) with a btrfs file system on top of it.

Once every several days (irregularly, as far as I can tell), the md1_raid6 process starts to consume 100 % of one CPU core, and during that time all file system access to the btrfs on top of this RAID device gets stuck (user space processes hang in the disk sleep state).

In most cases, after several I/O actions such as listing files (ls), querying btrfs information (btrfs filesystem, btrfs subvolume), or accessing the device directly (dd and the like), the file system magically gets unstuck and the md1_raid6 process is released from its "live lock" (or whatever loop it is caught in).

In the worse case, this "magic unsticking" does not succeed. Then I cannot even kill the processes stuck in the disk sleep state, and I am forced to reset the system.
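To see which processes are in the disk sleep state (state D) while the hang is in progress, something like the following could be used. It is only a minimal sketch: the helper names parse_stat_line and blocked_tasks are made up for illustration, and it looks at one entry per process under /proc rather than at every individual thread.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state) from the content of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state

def blocked_tasks(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    try:
        entries = os.listdir(proc)
    except OSError:
        return found  # no procfs at this path
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat_line(fh.read())
        except (OSError, ValueError):
            continue  # process exited meanwhile, or the line was unparsable
        if state == "D":
            found.append((pid, comm))
    return found

if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Running it a few times during a hang should show whether the same set of processes stays blocked or whether the set keeps changing.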

When the issue happens, I very often find messages like this in the kernel log (dmesg):

INFO: task md1_reclaim:910 blocked for more than 120 seconds.

with a call trace included.

There are also other "blocked" tasks, such as btrfs and btrfs-transaction, again with call traces included.
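To collect all such reports from a saved dmesg output in one go, a small parser along these lines may help. It is a sketch that assumes every report follows the wording of the line quoted above; the names HUNG_RE and hung_tasks are made up for illustration.

```python
import re

# Matches the hung task report line as quoted in this post.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def hung_tasks(log_text):
    """Return a list of (task name, pid, seconds) for each report found."""
    return [
        (m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_RE.finditer(log_text)
    ]

if __name__ == "__main__":
    # The only sample used here is the line quoted in the post.
    sample = "INFO: task md1_reclaim:910 blocked for more than 120 seconds."
    print(hung_tasks(sample))  # [('md1_reclaim', 910, 120)]
```

Passing the whole text of a saved kernel log to hung_tasks gives a quick overview of which tasks were reported and how often.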

Questions

What could be the cause of this problem?
What can I do to mitigate this issue?
Could it be a hardware problem? How can I track this?

What I have done so far

I keep the system up to date, with the latest stable kernel provided by Debian packages.
I ran both btrfs scrub and fsck.btrfs to rule out a btrfs file system issue.
I have read all the physical disks (with the dd command) and performed SMART self-tests to rule out a disk issue (though a read/write badblocks check has not been done yet).
I have also moved all the files off the affected file system, created a new btrfs file system (with recent btrfs-progs), and moved the files back. The issue nevertheless appeared again.
I have tried to attach strace to the spinning md1 process, but without success (is it even possible to strace a running kernel thread?).
Of course, I have searched the web for similar issues, but without success.
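On the strace attempt: as far as I understand, strace relies on ptrace, which cannot attach to kernel threads, so that failure is probably expected. What can usually be inspected instead is the kernel stack of the thread through /proc/<pid>/stack, which normally requires root and a kernel that provides this file. A minimal sketch, with kernel_stack being a made-up helper name:

```python
import os

def kernel_stack(pid, proc="/proc"):
    """Return the kernel stack of `pid` as a list of lines, or None if unreadable."""
    path = os.path.join(proc, str(pid), "stack")
    try:
        with open(path) as fh:
            return fh.read().splitlines()
    except OSError:
        # Typically: not running as root, the task has exited,
        # or the kernel does not provide this file.
        return None
```

Calling kernel_stack(pid) as root several times in a row, with the PID of the spinning md1_raid6 thread, should show whether it keeps returning to the same functions.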

Some detailed facts

OS information

Debian 10 buster (stable release)
Linux kernel 5.5 (from Debian backports)

Hardware information

8 core/16 threads amd64 processor (AMD EPYC 7251)
6 SATA HDD disks (Seagate Enterprise Capacity)
2 SSD disks (Intel D3-S4610)

RAID information

All Linux MD RAID (no HW RAID used)
RAID1 over 2 SSDs (md0)
- On top of this RAID1 is LVM with logical volumes for:
  - root file system
  - swap
  - RAID6 write journal
RAID6 over 6 HDDs and 1 LV as write journal (md1)
- This is the affected device

Usage information

The affected RAID6 block device is formatted directly as btrfs
This file system is used to store backups
Backups are performed via rsnapshot
rsnapshot is configured to use btrfs snapshots for hourly and daily backups and rsync to copy new backups

Other IO operations

There is ongoing monitoring through Icinga which uses iostat internally to monitor I/O
SMART self-tests run periodically (weekly and monthly)

I guess if the kernel reports a blocked kernel thread then there is a bug in any case. I suggest you report your case to the MD mailing list. — Hauke Laging, Jun 03 '20 at 23:37
Ok, thanks @HaukeLaging. I am not sure whether it is MD or BTRFS issue at the moment. Also, I want to eliminate my misuse of these technologies first. — Vojta Myslivec, Jun 04 '20 at 10:09
