I'm building raw disk images (i.e., `dd` a blank file, then chroot into it to install Linux). During the customization process I may delete files, use temporary files, etc.
What is the best way to delete these files to ensure the image is most compressible?
I'm assuming that if I simply `rm` a file, the filesystem just removes the directory entry and metadata and marks the blocks as free. That leaves the old data in place, so when I gzip or bzip2 the image it still has to compress all that stale data. I assume things would be a lot tighter if I could tell the filesystem to write zeros over those blocks instead.
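Something like the following is what I'm imagining (just a sketch, not tested against a real image: `$MNT` is a hypothetical mount point, and this demo falls back to a `/tmp` directory and caps `count=` so it doesn't actually fill the disk):

```shell
# Sketch: zero out free space on the mounted image, then delete the filler file.
# $MNT is a hypothetical mount point for the image; the fallback /tmp dir and
# count=8 are demo-only -- in real use, drop count= so dd runs until ENOSPC.
MNT="${MNT:-/tmp/zerofill-demo}"
mkdir -p "$MNT"
dd if=/dev/zero of="$MNT/zerofill" bs=1M count=8 2>/dev/null
sync                        # make sure the zeros actually reach the image
rm "$MNT/zerofill"          # free the blocks again; they now hold zeros
```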
A bit of detail: these are CentOS 6.4 installs on ext4, but I'd expect the answer applies to most Linux distros and most filesystems. I generate the base filesystem with a command like `dd if=/dev/zero of=filesystem.image bs=1M count=10240`. A typical 10GB disk image from a vanilla install compresses down to roughly 500MB. I bet if I did a more aggressive cleanup of temp files and such, I could get it a lot tighter.
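To illustrate why I think the zeroing matters (a quick demo with made-up temp files, not my actual image): gzip collapses runs of zeros to almost nothing, while "leftover" data that looks random barely shrinks at all:

```shell
# Compare gzip on 10MB of random "leftover" data vs 10MB of zeros.
dd if=/dev/urandom of=/tmp/leftover.img bs=1M count=10 2>/dev/null
dd if=/dev/zero    of=/tmp/zeroed.img   bs=1M count=10 2>/dev/null
gzip -f /tmp/leftover.img /tmp/zeroed.img
echo "leftover: $(wc -c < /tmp/leftover.img.gz) bytes compressed"
echo "zeroed:   $(wc -c < /tmp/zeroed.img.gz) bytes compressed"
```

The zeroed file compresses to a few KB; the random one stays at roughly its full 10MB.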
Thanks!