I noticed, while running some stress tests on a Kubernetes cluster, that etcd
snapshot sizes didn't increase much, even as I added more and more objects to my cluster.
I collected snapshots via:
etcdctl --endpoints="https://localhost:2379" --cacert="/etc/kubernetes/pki/etcd/ca.crt" --cert="/etc/kubernetes/pki/etcd/server.crt" --key="/etc/kubernetes/pki/etcd/server.key" snapshot save jay.db
And compared them:
root@tkg-mgmt-vsphere-20221014024846-control-plane-mp642:/home/capv# ls -altr jay*
-rw------- 1 root root 34975776 Oct 24 17:33 jay.db
-rw------- 1 root root 35061792 Oct 24 17:55 jay2.db
-rw------- 1 root root 35217440 Oct 24 18:05 jay3.db
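For what it's worth, raw file size alone may not tell the whole story. If I understand the tooling correctly, `etcdctl snapshot status` reports each snapshot's revision, key count, and logical size, which should show whether the keyspace itself is actually growing (a sketch, assuming the same v3 etcdctl as above; filenames are the ones from my test):

```
# Sketch: compare logical contents rather than just file sizes.
# Table columns are hash, revision, total keys, total size.
for f in jay.db jay2.db jay3.db; do
  etcdctl snapshot status "$f" --write-out=table
done
```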
So, since I'm putting large amounts of data into my cluster in these tests, I was wondering: does etcd
storage usage grow linearly? Or is it somehow compacted over time such that it never "gets that big"?
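My mental model (which may be wrong) is that etcd keeps an MVCC history of every key, so writes grow the database until old revisions are compacted away and the file is defragmented. A sketch of doing that by hand, assuming a v3 etcdctl and jq are available (endpoint/cert flags omitted for brevity):

```
# Sketch: compact away old revision history, then defragment so the freed
# pages are actually returned to the filesystem (compaction alone does not
# shrink the db file; defrag rewrites it).
rev=$(etcdctl endpoint status --write-out=json | jq '.[0].Status.header.revision')
etcdctl compaction "$rev"
etcdctl defrag
```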
I've seen related questions, such as "etcd 3.5 db_size growing constantly", where it appears that compaction keeps the size low, so I suppose my real question is:
- What are the boundaries and limits of how much work compaction can do in an ever-growing Kubernetes cluster of, say, hundreds, thousands, tens of thousands of objects, and beyond? (The knobs I'm aware of are sketched after this list.)
- Does compaction do an ever-better job over time, due to increasing amounts of similar or duplicate information content?
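On the first question, the knobs I've found so far (based on my reading, not verified on this cluster): the kube-apiserver requests compaction periodically via its `--etcd-compaction-interval` flag (5m by default, I believe), etcd itself can auto-compact, and etcd refuses writes once the backend quota is exceeded, which looks like the hard boundary. A sketch of the relevant etcd server flags as I understand them:

```
# Sketch: etcd flags that bound how large the db can get (assuming etcd v3.3+).
# Beyond the quota, writes fail with "mvcc: database space exceeded" and a
# NOSPACE alarm is raised until space is reclaimed.
etcd --auto-compaction-mode=periodic \
     --auto-compaction-retention=1h \
     --quota-backend-bytes=8589934592   # 8 GiB; the default is 2 GiB
```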