How can I measure the efficiency of a container image, in terms of what portion of its contents is actually used (accessed) by the processes running inside it?
There are various forms of wastage that can contribute to excessively large images: layers storing files that are superseded in later layers (which can be analysed using dive), binaries carrying unstripped debug information, or the inclusion of extraneous files (or data) that are simply not needed by the process that executes in the container. Here I'm asking about the last of these.
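For reference, this is the layer-level analysis I already get from dive (the image name below is just a placeholder): it reports an overall efficiency score and the bytes wasted on files duplicated or superseded across layers, but says nothing about whether the files that survive into the final image are ever actually read.

```sh
# Layer-by-layer inspection of an image ("myapp:latest" is a placeholder);
# dive reports an efficiency score and the space wasted on files that are
# duplicated or superseded across layers.
dive myapp:latest
```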
Are there Docker-specific tools (analogous to dive) for estimating/measuring this kind of wastage/efficiency, or should I just apply general Linux techniques? Can the filesystem access time (atime) be relied upon inside a container (to distinguish which files have and haven't been read since the container was instantiated), or do I need to instrument the image with tools like the Linux auditing system (auditd)?
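To make the atime idea concrete, here is roughly the kind of check I had in mind (the image name and workload are placeholders, and I'm assuming the image ships a shell plus awk/find); my concern is whether overlayfs and the relatime/noatime mount options invalidate the comparison:

```sh
# Check the mount options on the container's root filesystem:
# "noatime" would defeat the approach entirely, while "relatime"
# only updates atime under certain conditions.
docker run --rm myapp:latest awk '$2 == "/" {print $4}' /proc/mounts

# Drop a timestamp marker, run the workload, then list regular files
# whose atime was never updated after the marker, i.e. never read.
docker run --rm myapp:latest sh -c '
  touch /tmp/marker
  /usr/local/bin/myapp --run-once   # placeholder for the real workload
  find / -xdev -type f ! -anewer /tmp/marker
'
```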