
I have two compressed files whose contents are identical. When I compute the md5sum of the compressed files I get different results. When I decompress them first and then compute the md5sum, I get the expected result (the checksums are equal).

Is there a way to compare the files without unzipping them?

zcat | md5sum

Stef_92

1 Answer


I see two likely reasons why two compressed files would differ while containing the same uncompressed data:

  1. The compressed form of the data is different, but both files contain the same uncompressed data. This may be because a different gzip compression level was used, or because a different implementation of gzip was used to compress the files.

  2. The compressed data is identical, but the header of the compressed file differs, e.g. because metadata like the timestamp or original filename (which gzip stores in the header) are different, or are missing in one file. The metadata will differ based on how the compressed file was generated: for example, compressing a named file with gzip records the file's name and modification time in the header, whereas compressing a stream from standard input records no filename.
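
Both cases can be reproduced in memory. The following is a minimal sketch using Python's standard `gzip` module rather than the `gzip` command-line tool; the helper names `gzip_bytes` and `md5`, the sample payload, and the filename `data.txt` are made up for illustration.

```python
import gzip
import hashlib
import io


def gzip_bytes(data, level=9, mtime=0, filename=""):
    """Compress data in memory, controlling the fields gzip writes to its header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", compresslevel=level,
                       fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def md5(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


# Hypothetical sample payload; any bytes would do.
payload = b"".join(b"record %d\n" % i for i in range(2000))

base = gzip_bytes(payload, level=9, mtime=0)
variants = {
    "different level": gzip_bytes(payload, level=1, mtime=0),
    "different timestamp": gzip_bytes(payload, level=9, mtime=1_000_000),
    "with filename": gzip_bytes(payload, level=9, mtime=0, filename="data.txt"),
}

for label, blob in variants.items():
    same_compressed = md5(blob) == md5(base)
    same_contents = md5(gzip.decompress(blob)) == md5(payload)
    print(f"{label}: compressed equal={same_compressed}, contents equal={same_contents}")
```

The timestamp and filename variants change only header bytes, so their checksums are guaranteed to differ from the baseline while the decompressed contents stay the same.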

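Either way, you will need to decompress the data to get a checksum of its contents. You do not have to write the decompressed file to disk, though: the decompressed output can be streamed straight into the checksum.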
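
As a rough sketch of checksumming the contents without creating a decompressed file, the function below reads the decompressed stream in chunks, which is the same idea as piping `zcat` into `md5sum`. The function name `md5_of_gzip_contents` and the chunk size are assumptions for illustration.

```python
import gzip
import hashlib


def md5_of_gzip_contents(path, chunk_size=1 << 20):
    """Return the MD5 of the decompressed contents of a gzip file.

    The data is decompressed in chunks and fed to the hash, so nothing
    is written to disk and memory use stays bounded.
    """
    digest = hashlib.md5()
    with gzip.open(path, "rb") as stream:
        # Read until an empty bytes object signals end of stream.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files (say, hypothetical `a.gz` and `b.gz`) hold the same data if `md5_of_gzip_contents("a.gz") == md5_of_gzip_contents("b.gz")`, regardless of how their headers or compression levels differ.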