
This article gives a nice recipe for using a RAM disk as a cache device for a classical LVM volume.

Assuming you have an older disk, lots of RAM, and no SSD, you can boost disk performance to native RAM throughput using this technique.

So I did this on an LVM volume backing a VM running Windows 10. Voilà: disk throughput within the VM was 4 times faster (average throughput; most useful while patching Windows).

All was well, until I shut down my Linux system (CentOS 7).

Data Loss!

The shutdown will not disassemble that cache. The same would be true in a power-failure situation (yes, there will be data loss).
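
Presumably a clean shutdown would have to flush and detach the cache itself before the RAM disk vanishes, something like this sketch (assuming the cached LV is vg/lv), which nothing did for me:

# Flush dirty blocks and detach the cache pool while the RAM disk still exists
lvconvert --splitcache vg/lv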

However, there has to be a way to recover what is left. But LVM will not let you operate on a VG with missing disks.

So, is there a recipe for this case out there?

Something like:

  • recover missing LVM cache disk with a new disk
  • force clean state
  • access cached LV again

In the last step one would repair the filesystem and recover missing or corrupted files from backup (using rsync).
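
A sketch of that last step, assuming an XFS filesystem on /dev/vg/lv, a mount point /mnt/lv, and a backup tree at /backup/lv (all placeholders):

# Repair the filesystem, remount, then pull damaged/missing files back from backup
xfs_repair /dev/vg/lv
mount /dev/vg/lv /mnt/lv
rsync -aAXv /backup/lv/ /mnt/lv/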

Nils
  • There are definitely bugs in LVM when recovering from a lost cache volume, or when using RAM for cache. I ended up taking /etc/lvm/backup/lvmgroupname, modifying it manually to remove all the missing cache, renaming the _corig volume back and adding the "VISIBLE" flag, then using vgcfgrestore to get it back. – Brain2000 Jun 21 '20 at 21:19

1 Answer


you can boost disk performance to native RAM throughput using this technique

No, not quite. Once the cache fills with writes, write throughput degrades to whatever the underlying disk can do, and the first read of any block still comes from the slow backing disk. I/O that hits the cache is faster, yes.


This method is very risky because the cache device is not persistent at all. I'm skeptical of its utility when you could just get a fast solid-state disk or a hardware write cache, but failure modes are fun to test.

THIS WILL CAUSE DATA LOSS. ONLY PROCEED IF YOU HAVE A BACKUP OF YOUR DATA.

First, without cache.

# Create volume
pvcreate /dev/sdb
vgcreate vg /dev/sdb
lvcreate --size 400g --name lv vg
mkfs.xfs /dev/vg/lv
mkdir /mnt/lv
mount /dev/vg/lv /mnt/lv
# Write test
dd bs=1M if=/dev/zero of=/mnt/lv/zero count=10000
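
Optionally, a read baseline as well; drop the page cache first so the reads actually hit the disk:

# Read test (page cache dropped so reads come from disk, not RAM)
sync; echo 3 > /proc/sys/vm/drop_caches
dd bs=1M if=/mnt/lv/zero of=/dev/null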

Add cache.

# Create a RAM disk
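# (rd_size is in KiB: 4585760 KiB ≈ 4.37 GiB, matching the PV size shown below)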
modprobe brd rd_nr=1 rd_size=4585760 max_part=0
pvcreate /dev/ram0
vgextend vg /dev/ram0
# Create a cache
lvcreate -L 300M -n cache_meta vg /dev/ram0
lvcreate -L 4G -n cache_vol vg /dev/ram0
lvconvert --type cache-pool --poolmetadata vg/cache_meta --cachemode=writeback vg/cache_vol -y
# Add cache to a LV
lvconvert --type cache --cachepool vg/cache_vol vg/lv
# Write test
dd bs=1M if=/dev/zero of=/mnt/lv/zero2 count=10000
# Crash test
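# (sysrq 'c' forces an immediate kernel crash, simulating sudden power loss)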
echo 'c' > /proc/sysrq-trigger

When it comes back, LVM is very unhappy and the volume is inaccessible.

[root@sf ~]# lvs
  WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
  LV   VG Attr       LSize   Pool        Origin     Data%  Meta%  Move Log Cpy%Sync Convert
  lv   vg Cwi---C-p- 400.00g [cache_vol] [lv_corig]
[root@sf ~]# mount /dev/vg/lv /mnt/lv/
mount: special device /dev/vg/lv does not exist
[root@sf ~]# pvs
  WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
  PV         VG Fmt  Attr PSize    PFree
  /dev/sdb   vg lvm2 a--  <500.00g 99.70g
  [unknown]  vg lvm2 a-m     4.37g 80.00m

You can't even force an uncache, because the metadata device returns an I/O error.

[root@sf ~]# lvconvert --uncache vg/lv --force -y
  WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
  WARNING: Cache pool data logical volume vg/cache_vol_cdata is missing.
  WARNING: Cache pool metadata logical volume vg/cache_vol_cmeta is missing.
  WARNING: Uncaching of partially missing writethrough cache volume vg/lv might destroy your data.
  /dev/mapper/vg-cache_vol_cmeta: read failed: Input/output error
  Failed to active cache locally vg/lv.

But you can accept the data loss and force it through by creating a new PV with the same UUID. Then uncache to remove the RAM-disk PV that LVM thinks still holds data, but which was lost. You could re-add a new cache with lvconvert, but I am not going to after the results of this experiment.

pvcreate --norestorefile --uuid YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R /dev/ram0
lvconvert --uncache vg/lv

Finally, check for any file system damage. A restore from backup is required to get your data back into a good state.

xfs_repair /dev/vg/lv
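
If xfs_repair refuses to run because the log is dirty (likely, since the cache holding the most recent writes is gone), a mount/unmount cycle replays the log; failing that, zeroing the log is the last resort:

# Last resort: zero the XFS log, sacrificing in-flight metadata updates
xfs_repair -L /dev/vg/lv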

Edit: adding an empty PV with the same UUID back in seems super hacky. lvconvert refused to uncache it, after all. If instead you put the metadata LV on permanent disk, it can be cleaned up a little more easily.

# Same procedure but meta is on persistent storage.
lvcreate -L 300M -n cache_meta vg /dev/sdb
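# The rest of the cache setup is unchanged; the cache data LV stays on the RAM disk
lvcreate -L 4G -n cache_vol vg /dev/ram0
lvconvert --type cache-pool --poolmetadata vg/cache_meta --cachemode=writeback vg/cache_vol -y
lvconvert --type cache --cachepool vg/cache_vol vg/lv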

It can be forced to uncache. Don't let the "Flushing 0 blocks" output reassure you; the in-flight writes were already lost. The missing RAM disk can then be removed, making the VG consistent again.

[root@sf ~]# lvconvert --uncache vg/lv --force -y
  WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
  WARNING: Cache pool data logical volume vg/cache_vol_cdata is missing.
  WARNING: Uncaching of partially missing writethrough cache volume vg/lv might destroy your data.
  Flushing 0 blocks for cache vg/lv.
  Logical volume "cache_vol" successfully removed
  Logical volume vg/lv is not cached.
[root@sf ~]# vgreduce --removemissing vg
  WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
  Wrote out consistent volume group vg.
[root@sf ~]# pvs
  PV         VG Fmt  Attr PSize    PFree
  /dev/sdb   vg lvm2 a--  <500.00g <100.00g
John Mahowald
    Edited to add an alternative where meta is on persistent storage. Slightly easier to clean up, still a performance boost. The big warning is for anyone who might find this in the future. It is like the local SSDs you may find on some cloud instances. Nice and fast, but heed the warning that data won't survive a reboot. – John Mahowald Sep 23 '18 at 15:45
  • Hi, my system crashed today (halt) with kernel 3.10.0-1062.4.1.el7.x86_64 while setting up the cache. I recovered after hard reset using your method. No data loss - but I had to use `lvchange -a y vg` in between. Why did you not need that for uncaching? – Nils Oct 27 '19 at 21:17
  • Do not use this procedure; it is an exercise in operational hassle and data loss. I think in the end I only successfully uncached in the scenario where the poolmetadata LV was on persistent storage. The details of your configuration and what happened could be another question. – John Mahowald Feb 13 '20 at 02:07