
I am trying to understand how to scale my Prometheus setup and am looking at its storage mechanism.

Let's assume the following:

  1. Prometheus data storage directory: 20 GB in size
  2. Number of snapshots: 3
  3. Snapshot size: 18 GB each

Question: Without symlinks, how can the sum of the snapshot sizes be larger than the total size of the directory? And how is it ensured that a snapshot contains all the data required?

I assume that Prometheus's storage mechanism stores references instead of the actual data. But what mechanism exactly is at work here? I tried to find out what is behind this, without success.

Pointers in the right direction are welcome as well. I would like to understand the principles at least.

Panade
  • Came to find out the same thing :( Any hints people of the internet? – Nour Wolf Mar 09 '23 at 10:11
  • @NourWolf I didn't put much more effort into this, as we looked at Thanos to manage the entire thing. But what I think I overlooked back then was the fact that a snapshot is not a backup. I think it works like Git versioning, by referencing the changes instead of storing the entire data. That way you can move forward and backward along the snapshot timeline. Maybe an interesting read on the topic: https://ganeshvernekar.com/blog/prometheus-tsdb-snapshot-on-shutdown/ – Panade Mar 10 '23 at 12:15

1 Answer


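Prometheus snapshots use hard links in v2.1 onward. That explains the filesystem usage behaviour observed by the OP: measured on its own, each snapshot reports the full size of the blocks it links to, but those blocks exist only once on disk, so the total for the data directory stays far below the sum of the snapshot sizes.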

The snapshots are comprised of hard links of existing blocks, and a dump of the current open blocks. As hard links are in use this means that the snapshots of older blocks take no additional disk space as there's only one copy kept on disk, however you may break Prometheus if you change them, their permissions or their user/group. When you're done you can rm -rf the snapshot directory, as while the snapshot takes little additional disk space initially, once the original block gets deleted/compacted the snapshot would then be what is keeping that disk space used.

Source: https://www.robustperception.io/taking-snapshots-of-prometheus-data/
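To make the hard-link accounting concrete, here is a minimal sketch in Python. It is not Prometheus code: the directory layout, the file name `chunk`, and the helper `unique_disk_usage` are hypothetical and only mimic one block file being hard-linked into three snapshot directories. It assumes a filesystem that supports hard links.

```python
import os
import tempfile


def unique_disk_usage(root):
    """Sum file sizes under root, counting each inode only once (as du does)."""
    seen = set()
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                total += st.st_size
    return total


def demo():
    with tempfile.TemporaryDirectory() as data_dir:
        # Hypothetical stand-in for a persisted block file.
        block = os.path.join(data_dir, "block", "chunk")
        os.makedirs(os.path.dirname(block))
        with open(block, "wb") as f:
            f.write(b"x" * 1024)

        # Three "snapshots", each holding a hard link to the same file.
        snap_sizes = []
        for i in range(3):
            snap_dir = os.path.join(data_dir, "snapshots", f"snap{i}")
            os.makedirs(snap_dir)
            os.link(block, os.path.join(snap_dir, "chunk"))
            snap_sizes.append(unique_disk_usage(snap_dir))

        nlink = os.stat(block).st_nlink       # names pointing at the inode
        total = unique_disk_usage(data_dir)   # whole directory, deduplicated

        # Deleting the original name does not free the data:
        # the snapshot links still reference the same inode.
        os.remove(block)
        remaining = unique_disk_usage(data_dir)
        return snap_sizes, nlink, total, remaining


if __name__ == "__main__":
    print(demo())
```

Each snapshot directory reports the full file size when measured on its own, while the parent directory counts the data only once. The last step mirrors the warning in the quote: once the original block is removed, the snapshot links are what keep the space in use.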

scetoaux