
Much to my surprise, the disk space used by containerd for storing Docker images (as reported by du) appears to be additive across namespaces, so it is possible that multiple copies of the same container image are being kept, one per namespace.

I suspect it's just a measurement artifact. How can I make sure it is not wasteful duplication?


Illustration

In the real-life example below, taken from one of our DEV servers, containerd (as used by microk8s) keeps 10 copies of the same container image (the same as verified by checksums; a rather heavy dev image of 8.2 GiB), taking up 10 times more storage space (i.e. 90+ GiB for this single image) than it arguably should (and than it would under Docker or "good old" k8s [e.g. 1.11 as used in OCP/OKD 3.11]). Now imagine what happens if users have, say, 10 such images to choose from and there are 100+ such users...

# calculate total disk space occupied by all Docker images stored by `containerd`
# du -sh k8s.io/
101G    k8s.io/

# calculate disk space occupied by each copy of the container image
# (kept in 10 separate subfolders of k8s.io/ - one per namespace; run inside k8s.io/)
# du -sh -- */ | sort -rh
8.2G    d8a5f98e90a4240465bce9011ee9855f38a1355d98001651692ac32b60de692a/
8.2G    c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8/
8.2G    bf2a683ee785018d90f68ed474f757b11b70b173547a767f4d38e1bec6cb20f8/
8.2G    ab99b16fb660adf4e50abb28ac060db70cc7298624edc6a27431232f1faae6f2/
8.2G    93af9650770c16d30cd8d5053c2df3bf42dcd11507b385b99222f29eacd58a08/
8.2G    75813eff8ad2f13cc2e0ae5dcde21cf107dd0b5f8837e750d7af9b2cf9b37f91/
8.2G    51b566aaf4cd4e8015f5f1c11f6b1f365d7137ecca0359300c3fd57c6b268502/
8.2G    42940c4bca12ee98b5ef0ac82bb3406568cc1b0ada2c36b04b3c9805db947d24/
8.2G    402a3615985b1ebb8f0336da22ef61a3de1899839948504a1c57a23ee7f23ef9/
8.2G    03543e31148721edbcccbee33fba7ec03225aad8f5514508c1e65b80072128ab/
1020M   f1c59b268079406b551ab9c47af8f89a36e94fb9c783541ee9e1665b67118fb1/
1020M   da34eb37a3dc01b2192820232288d5cd088bac93283e2b9d7a235eb7fb38d06e/
1020M   c272009b8d419a427354baee14cb243bc1d49a26f908b7cd30661d8a66a3f587/
1020M   98adad916cf5b3eaa62fe631f97dace945bcfecbb799ba1c2b7ce6901040a780/
1020M   968a7897f10699e96df62a9bc3b8d742a1a048cf7481990531800760e1dcf58c/
1020M   940acce942de009b74073efde5f86c8c019933a3b48e0c3b69c94b559aa644d7/
1020M   874b981524dfd4a4bb96dc6a4abc80a9dd157e0ca0607f9c65328a94056c64f8/
1020M   3e0199e23bd4670acae87c2851e2e2963678c2654fdabb96af79194484770524/
1020M   32a7f6ab5729d4b6464189fc807ef560f2ca0caebcaaf33619f7bf311a129d44/
1020M   2c3a95b1a9531d3ff68984da01460dee050268e28c50cb92894ab8aea49b0183/
[..]


# changing `du` options does not affect the measurements much
# du -sh c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8
8.2G    c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8
# du -sh --apparent-size c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8
7.7G    c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8
# du -sh --count-links c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8
8.7G    c7be1e30a7c59c77d6c2b9e645af6fc231ee0e3626b76792d2c5fa76df501dc8
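
# If the per-namespace directories are in fact overlayfs mounts of running containers'
# root filesystems (an assumption on my side, not verified on this node), `du` counts
# the same shared lower layers once for every mount, inflating the total.
# This can be checked with standard tools:
# findmnt -t overlay | grep k8s.io      # list overlay mounts below k8s.io/
# du -shx k8s.io/                       # stay on one filesystem, skipping such mounts
# df -h .                               # compare with what the filesystem reports as used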



# let's also verify that the image checksums are indeed the same
Now using project "ak-tst" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "ks-tst" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "tz-tst" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "user1-tst" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "user2-tst" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "ak" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "ks" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "ml" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "tz" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "user1" on server "https://localhost:8080".
    Image ID:       docker.io/mirekphd/ml-scraper-py39-jup@sha256:6ab4c3bd81d9125eecef0906ce79440068690cb4a53ec123c50febbe5f2fe17b
Now using project "user2" on server "https://localhost:8080".
[..]
  • "How can I get assured it's not a wasteful duplication?" Have you tried testing it, running multiple instances of a large image to see if you exhaust your free disk space? – BMitch Jul 06 '23 at 11:15
  • Yes, @BMitch, we are running this 10-container node from the illustration, and have indeed used most of the space (the node is under disk pressure), hence the investigation in search of "waste". But the available physical space on the node is much smaller (40% of what `du` reports for `k8s.io`), so I suspect some double-counting is going on here (but to what extent I don't know). – mirekphd Jul 06 '23 at 11:22

1 Answer


Given that microk8s uses containerd, a CRI-compatible container runtime, you can use crictl, a generic CLI for CRI-compatible container runtimes.

Having connected crictl to the microk8s CRI socket (for microk8s installed via snap this lives in the non-standard location /var/snap/microk8s/common/run/containerd.sock), the crictl images command can be used to list the container images present on the node, very much like docker images.
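
For example, a quick way to point crictl at that socket (a sketch; paths and sudo usage may differ on your system) is either a one-off flag or a persistent /etc/crictl.yaml:

# one-off: pass the endpoint explicitly
$ sudo crictl --runtime-endpoint unix:///var/snap/microk8s/common/run/containerd.sock images

# persistent: store the endpoint so that plain `crictl` calls work
$ cat /etc/crictl.yaml
runtime-endpoint: unix:///var/snap/microk8s/common/run/containerd.sock
image-endpoint: unix:///var/snap/microk8s/common/run/containerd.sock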

And it demonstrates that indeed only one copy of the image of interest is present on the node, even though it shows up under 10 separate subfolders of the k8s.io/ folder (one for every namespace where a container using this image is running):

$ crictl images | grep mirekphd
docker.io/mirekphd/ml-scraper-py39-jup             latest              2419ce3bacb4a       3.05GB
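
As an additional cross-check (assuming your crictl version supports it), crictl imagefsinfo reports the actual disk usage of the image filesystem as seen by the CRI, which should come out far below the 101 GiB that du sums up across namespaces:

$ sudo crictl --runtime-endpoint unix:///var/snap/microk8s/common/run/containerd.sock imagefsinfo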

For completeness, here are crictl installation instructions targeting microk8s specifically.
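
If you prefer a distribution-agnostic route, a minimal install sketch (the version below is only an example; pick a crictl release roughly matching your cluster's Kubernetes minor version) is to download the static binary from the kubernetes-sigs/cri-tools releases:

# example version - adjust to match your Kubernetes version
$ VERSION="v1.28.0"
$ curl -L "https://github.com/kubernetes-sigs/cri-tools/releases/download/${VERSION}/crictl-${VERSION}-linux-amd64.tar.gz" | sudo tar -C /usr/local/bin -xz
$ crictl --version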
