When I calculate the entropy values of files compressed with Gzip, PKZIP, 7-Zip and WinRAR, I find that Gzip achieves a higher compression ratio than the others: the entropy of the compressed output is higher (indicating less residual redundancy) and the resulting file is smaller. Even for small files, Gzip's overhead is lower than that of the other tools. To be fair, this is not the case for every file format; for xlsx, for example, 7-Zip and PKZIP do better than Gzip and WinRAR. But still, I'm quite surprised, because 7-Zip is generally considered the stronger compressor in the sense that it reduces file size more, and that does not match my results. Or am I doing something completely wrong... or...?
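For reference, by "entropy" I mean the byte-level Shannon entropy of the compressed file. My calculation is essentially something along these lines (a minimal sketch; the function name and structure are just illustrative):

```python
import math
from collections import Counter

def byte_entropy(path):
    """Shannon entropy of a file in bits per byte (8.0 = indistinguishable from random)."""
    with open(path, "rb") as f:
        data = f.read()
    if not data:
        return 0.0
    total = len(data)
    # -sum(p * log2(p)) over the probability p of each byte value 0..255
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())
```

So a value close to 8 bits/byte means the compressor left almost no byte-level redundancy behind.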
These results are not based on just a few files: I compressed a large set of files across different formats and calculated the size deltas with Python.
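The delta calculation itself is trivial; roughly this (a sketch, with placeholder paths):

```python
import os

def size_delta(original_path, compressed_path):
    """Bytes saved and fraction of the original size saved by compression."""
    orig = os.path.getsize(original_path)
    comp = os.path.getsize(compressed_path)
    return orig - comp, (orig - comp) / orig

# e.g. compare a file against its gzipped version
print(size_delta("report.docx", "report.docx.gz"))
```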
What I also find quite interesting: looking at PDF files, I would expect that PDF 1.5 and higher in particular can hardly be compressed further by a lossless compressor, as those files are already heavily compressed internally. But I see little difference between PDFs below version 1.5 and those at 1.5 or above; both are compressed quite substantially by these tools.
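In case it matters: I split the PDFs into the two groups by the version in the file header, roughly like this (a sketch; real-world PDFs can have quirks, so this is not bulletproof):

```python
def pdf_version(path):
    """Parse the version from the '%PDF-x.y' header line; None if absent."""
    with open(path, "rb") as f:
        header = f.read(8)            # e.g. b'%PDF-1.5'
    if not header.startswith(b"%PDF-"):
        return None
    return header[5:8].decode("ascii")  # '1.4', '1.5', ...
```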
By the way, I used the default algorithms and settings of each archiver.
Can someone explain how/why this is the case (maybe I'm doing something wrong), or do these results actually make sense (I can't find anything online that supports them)?