
I don't have a good understanding of how COW snapshots work, but I expect each one to contain only its diffs, with the unchanged data shared among all snapshots of the same parent subvolume.

I wrote a script to check how much disk space btrfs snapshots consume.

#!/usr/bin/zsh

for i in {1..2000}
do
    echo 'line'$i >> /btrfs/test-volume/btrfs-doc.txt
    /usr/bin/time -f "execution time: %E" btrfs subvolume snapshot /btrfs/test-volume /btrfs/snapshots/test-volume-snap$i
done

After running it, I checked the sizes with several tools and got the following:

❯ btrfs filesystem df /btrfs
Data, single: total=8.00MiB, used=6.84MiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=102.38MiB, used=33.39MiB
GlobalReserve, single: total=3.25MiB, used=0.00B

❯ btrfs filesystem du -s /btrfs
     Total   Exclusive  Set shared  Filename
  18.54MiB     6.74MiB    36.00KiB  /btrfs

❯ df -h /btrfs
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/vgstoragebox-btrfs  2.0G   77M  1.8G   5% /btrfs

❯ du -sh /btrfs
20M     /btrfs

❯ ll /btrfs/test-volume/btrfs-doc.txt
-rw-r--r-- 1 root root 17K Jul  6 14:50 /btrfs/test-volume/btrfs-doc.txt

❯ tree -hU /btrfs/snapshots
/btrfs/snapshots
├── [  26]  test-volume-snap1
│   └── [   6]  btrfs-doc.txt
├── [  26]  test-volume-snap2
│   └── [  12]  btrfs-doc.txt
├── [  26]  test-volume-snap3
│   └── [  18]  btrfs-doc.txt
...
├── [  26]  test-volume-snap1998
│   └── [ 16K]  btrfs-doc.txt
├── [  26]  test-volume-snap1999
│   └── [ 16K]  btrfs-doc.txt
└── [  26]  test-volume-snap2000
    └── [ 16K]  btrfs-doc.txt

2000 directories, 2000 files

Each tool calculates the size differently, so I can't say how much disk space the /btrfs/snapshots directory actually consumes, but it is clearly much more than double the size of the file /btrfs/test-volume/btrfs-doc.txt. I expected roughly double that size if btrfs snapshots contain only the diffs and the shared data is merely linked.

By comparison, I ran the same test with LVM snapshots, and they consumed only a small amount of disk space.

kvdm.dev

1 Answer


From a userland perspective btrfs snapshots are just simple directories containing the files and contents of the subvolume at the time the snapshot was created. You can access them normally like any other directory.

Therefore, the userland tools you used report the sizes of the individual files within a snapshot just as they would for any other file. If you create, say, 10 snapshots of the same subvolume, tools such as du will report the same total size for each snapshot, and summing over all 10 snapshots gives a disk usage of 10 times the size of the initial subvolume.

However, because of the CoW nature of these subvolumes, the files within the snapshots all share the same data blocks on disk. So although du reports 10 times the total size, that data is stored on disk only once.
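The arithmetic behind that over-count can be sketched with a toy model in plain shell. This is not real btrfs: the block IDs, the 4 KiB block size, and the four-block subvolume are all made-up assumptions, chosen only to show why summing per-snapshot sizes differs from counting each shared block once.

```shell
#!/bin/sh
# Toy model, not real btrfs: every "snapshot" is just a list of the
# block IDs it references. All names and sizes here are hypothetical.
block_size_kib=4
blocks="b1 b2 b3 b4"     # a made-up subvolume with four data blocks
snapshots=10             # ten snapshots, none of them modified

# Collect one line per (snapshot, block) reference.
refs=$(
    i=1
    while [ "$i" -le "$snapshots" ]; do
        for b in $blocks; do echo "$b"; done
        i=$((i + 1))
    done
)

# What a du-style walk adds up: every reference counts in full.
apparent=$(( $(printf '%s\n' "$refs" | wc -l) * block_size_kib ))
# What is actually stored: each distinct block counts once.
on_disk=$(( $(printf '%s\n' "$refs" | sort -u | wc -l) * block_size_kib ))

echo "apparent=${apparent}KiB on_disk=${on_disk}KiB"
```

With these assumed numbers the apparent total is ten times the stored total, which is the same ratio as in the 10-snapshot example above.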


The way Copy-on-Write works is that a new copy of a file (e.g. one created with cp --reflink) or a new snapshot is at first nothing more than a new pointer to the same physical data on disk as the original file/subvolume. So the new file does not use any additional disk space (besides some additional metadata).

Only when the data is changed is the new data written to a new place on the disk, and the pointer of the file/snapshot is updated to include that data block. All unchanged parts of the data are still shared with the original copy.

This is why creating snapshots is very fast and uses next to no additional disk space. Over time, however, the disk space used by a snapshot may grow, since the data blocks it references diverge from the original subvolume and fewer and fewer data blocks are actually shared.
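The divergence described above can be illustrated with another toy model in plain shell. Again, this is only a sketch under made-up assumptions: the block IDs are hypothetical, and the helper `count_shared` exists only for this illustration, it is not a btrfs tool.

```shell
#!/bin/sh
# Toy model of copy-on-write sharing; block IDs are made up.
original="b1 b2 b3 b4"
snapshot=$original       # a fresh snapshot references the same blocks

# Overwriting the third block of the original writes a new block (b5)
# and repoints the original at it; the snapshot still references b3.
original="b1 b2 b5 b4"

# Hypothetical helper: count the blocks of list $1 that also occur in list $2.
count_shared() {
    n=0
    for b in $1; do
        case " $2 " in
            *" $b "*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

shared=$(count_shared "$original" "$snapshot")
# Distinct blocks referenced by either side, i.e. what is stored on disk.
unique_on_disk=$(( $(printf '%s\n' $original $snapshot | sort -u | wc -l) ))

echo "shared=$shared unique_on_disk=$unique_on_disk"
```

After the single overwrite, three of the four blocks are still shared and only one extra block is stored. Each further overwrite of a still-shared block would lower the shared count by one and add one block on disk.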


If you want to see the amount of data that is shared between/unique to the individual subvolumes you can use the quota support feature of btrfs.
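A minimal sketch of that workflow follows, assuming the filesystem is mounted at /btrfs as in the question. The `run` wrapper, the `DRY_RUN` switch, and the function name `show_snapshot_usage` are made up for this sketch. By default the script only prints the commands; set `DRY_RUN=0` and run it as root to actually execute them. Note that enabling quotas and rescanning can take a long time on a large existing filesystem.

```shell
#!/bin/sh
# Sketch only: prints the btrfs commands unless DRY_RUN=0 is set.
# MNT, DRY_RUN, run and show_snapshot_usage are names invented here.
MNT=${MNT:-/btrfs}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

show_snapshot_usage() {
    # Turn on quota accounting for the filesystem.
    run btrfs quota enable "$MNT"
    # Rescan existing data and wait (-w) until the scan has finished.
    run btrfs quota rescan -w "$MNT"
    # List referenced and exclusive sizes per qgroup (one per subvolume).
    run btrfs qgroup show "$MNT"
}

show_snapshot_usage
```

In the `btrfs qgroup show` output, the referenced size is what a subvolume or snapshot can reach in total, while the exclusive size is the part not shared with any other subvolume.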

acran
  • Thanks, I understand the theory of CoW well enough but got confused because I couldn't display it with the commands I know. You mentioned `btrfs quota`, so could you add a basic example to your answer to demonstrate how it works? The reflink example isn't clear; I had a look at it. – kvdm.dev Aug 26 '22 at 08:19
  • Just follow the wiki to [enable quotas](https://btrfs.wiki.kernel.org/index.php/Quota_support#Enabling_quota) - note: for existing filesystems this may take very long - and use `btrfs qgroup show` or, for nicer output, one of the [scripts from the wiki](https://btrfs.wiki.kernel.org/index.php/Quota_support#.2Fusr.2Flocal.2Fbin.2FbtrfsQuota.sh) page to view the shared/unique disk usage. – acran Aug 29 '22 at 11:07