"you can boost disk performance to native RAM-throughput using this technique"
No, not quite. Once the cache is full of dirty writes, write throughput degrades to what the underlying disk can sustain, and the first read of any block still comes from the slow backing disk. I/O that hits the cache is faster, yes.
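One way to watch the cache fill up is to poll the dirty-block count of the cached LV. This is a minimal sketch, assuming an LVM version whose lvs exposes the cache_dirty_blocks and cache_total_blocks report fields; cache_fill is a helper name made up for this example.

```shell
# Sketch: print dirty vs. total cache blocks for a cached LV.
# cache_fill is a hypothetical helper, not an LVM command.
# Assumes lvs supports the cache_dirty_blocks and cache_total_blocks fields.
cache_fill() {
    # $1 is the LV in vg/lv form, for example vg/lv
    lvs --noheadings -o cache_dirty_blocks,cache_total_blocks "$1" |
        awk '{ printf "%d/%d dirty (%.1f%%)\n", $1, $2, ($2 ? 100 * $1 / $2 : 0) }'
}
```

Calling `cache_fill vg/lv` in a loop during a large write should show how quickly a writeback cache accumulates dirty blocks; I have not measured this on the setup below.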
This method is very risky because the cache is not persistent: anything held only in the RAM disk is gone after a crash or power loss. I'm skeptical of the utility when you can just get a fast solid-state disk or a hardware write cache, but failure modes are fun to test.
THIS WILL CAUSE DATA LOSS. ONLY PROCEED IF YOU HAVE A BACKUP OF YOUR DATA.
First, without cache.
# Create volume
pvcreate /dev/sdb
vgcreate vg /dev/sdb
lvcreate --size 400g --name lv vg
mkfs.xfs /dev/vg/lv
mkdir /mnt/lv
mount /dev/vg/lv /mnt/lv
# Write test
dd bs=1M if=/dev/zero of=/mnt/lv/zero count=10000
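A caveat on the write test above: dd without a sync option can finish while part of the data is still in the page cache, so the reported rate may flatter the disk. A minimal sketch of a variant that includes the final flush follows; write_test is a hypothetical helper and conv=fsync assumes GNU dd.

```shell
# Sketch: write test that waits for the data to be flushed before dd exits.
# write_test is a made-up helper name; conv=fsync is a GNU dd option.
write_test() {
    # $1 = target file, $2 = size in MiB
    dd bs=1M if=/dev/zero of="$1" count="$2" conv=fsync
}
```

For example, `write_test /mnt/lv/zero 10000` would repeat the test above with the flush time counted.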
Add cache.
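A note on the rd_size value used in the next step: the brd module takes the RAM disk size in KiB, so the device has to be at least as large as the cache data LV plus the metadata LV. The arithmetic can be sketched as below; rd_size_kib is a hypothetical helper, not part of brd or LVM.

```shell
# Sketch: convert a target RAM disk size to the KiB value brd's rd_size expects.
# rd_size_kib is a made-up helper name for this example.
rd_size_kib() {
    # $1 = whole GiB, $2 = additional MiB
    echo $(( ($1 * 1024 + $2) * 1024 ))
}

rd_size_kib 4 300    # 4 GiB cache data + 300 MiB metadata, prints 4501504
```

The 4585760 used below is about 82 MiB more than that minimum, which leaves headroom for LVM's own PV metadata and extent rounding.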
# Create a RAM disk
modprobe brd rd_nr=1 rd_size=4585760 max_part=0
pvcreate /dev/ram0
vgextend vg /dev/ram0
# Create a cache
lvcreate -L 300M -n cache_meta vg /dev/ram0
lvcreate -L 4G -n cache_vol vg /dev/ram0
lvconvert --type cache-pool --poolmetadata vg/cache_meta --cachemode writeback vg/cache_vol -y
# Add cache to a LV
lvconvert --type cache --cachepool vg/cache_vol vg/lv
# Write test
dd bs=1M if=/dev/zero of=/mnt/lv/zero2 count=10000
# Crash test
echo 'c' > /proc/sysrq-trigger
When it comes back, LVM is very unhappy and the volume is inaccessible.
[root@sf ~]# lvs
WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv vg Cwi---C-p- 400.00g [cache_vol] [lv_corig]
[root@sf ~]# mount /dev/vg/lv /mnt/lv/
mount: special device /dev/vg/lv does not exist
[root@sf ~]# pvs
WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
PV VG Fmt Attr PSize PFree
/dev/sdb vg lvm2 a-- <500.00g 99.70g
[unknown] vg lvm2 a-m 4.37g 80.00m
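In the output above, the m in the PV's a-m attribute marks a missing device, and the p in the ninth position of the LV's Cwi---C-p- marks a partial LV. A minimal sketch for picking missing PVs out of pvs output follows; missing_pvs is a made-up helper, and it assumes the attribute is the last column of each row.

```shell
# Sketch: list PVs that LVM reports as missing.
# Reads "pv_name vg_name pv_attr" rows on stdin; missing_pvs is a hypothetical name.
missing_pvs() {
    # The third character of pv_attr is "m" when the device is missing.
    awk 'substr($NF, 3, 1) == "m" { print $1, $2 }'
}
# Intended use (needs root and LVM):
#   pvs --noheadings -o pv_name,vg_name,pv_attr | missing_pvs
```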
You can't even force uncache it because the metadata has an I/O error.
[root@sf ~]# lvconvert --uncache vg/lv --force -y
WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
WARNING: Cache pool data logical volume vg/cache_vol_cdata is missing.
WARNING: Cache pool metadata logical volume vg/cache_vol_cmeta is missing.
WARNING: Uncaching of partially missing writethrough cache volume vg/lv might destroy your data.
/dev/mapper/vg-cache_vol_cmeta: read failed: Input/output error
Failed to active cache locally vg/lv.
But you can force the data loss by creating a new, empty PV with the same UUID, then uncaching to drop the RAM disk PV that LVM thinks still holds data that was in fact lost. You could re-attach a new cache with lvconvert, but I am not going to after the results of this experiment.
pvcreate --norestorefile --uuid YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R /dev/ram0
lvconvert --uncache vg/lv
Finally, check for file system damage. Restoring from backup is required to get your data back into a known-good state.
xfs_repair /dev/vg/lv
Edit: adding an empty PV back in with the same UUID seems super hacky. lvconvert refused to uncache it, after all. If you instead put the metadata LV on persistent disk, cleanup is a little easier.
# Same procedure but meta is on persistent storage.
lvcreate -L 300M -n cache_meta vg /dev/sdb
With that layout, the LV can be forced to uncache. Don't let the "Flushing 0 blocks" output reassure you: in-flight writes were already lost. The missing RAM disk PV can then be removed, making the VG consistent again.
[root@sf ~]# lvconvert --uncache vg/lv --force -y
WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
WARNING: Cache pool data logical volume vg/cache_vol_cdata is missing.
WARNING: Uncaching of partially missing writethrough cache volume vg/lv might destroy your data.
Flushing 0 blocks for cache vg/lv.
Logical volume "cache_vol" successfully removed
Logical volume vg/lv is not cached.
[root@sf ~]# vgreduce --removemissing vg
WARNING: Device for PV YpvOB5-PZLO-POFL-3Cf4-G1IB-gep8-6eU10R not found or rejected by a filter.
Wrote out consistent volume group vg.
[root@sf ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/sdb vg lvm2 a-- <500.00g <100.00g
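The two cleanup commands from that last transcript can be wrapped in one function that only prints what it would do by default. This is a sketch, not a tested recovery tool; drop_lost_cache and the DRY_RUN variable are names made up for this example.

```shell
# Sketch: the forced uncache plus VG cleanup as one function, dry-run by default.
# drop_lost_cache and DRY_RUN are hypothetical names for this example.
drop_lost_cache() {
    vg=$1 lv=$2
    # With DRY_RUN unset or set to 1, only print the commands.
    run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }
    run lvconvert --uncache "$vg/$lv" --force -y &&
    run vgreduce --removemissing "$vg"
}

drop_lost_cache vg lv    # prints the two commands without running them
```

Setting DRY_RUN=0 runs the commands for real, which discards whatever was only in the cache, so the warning about backups applies here too.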