This is more of a conceptual question: if I have a File A and a File B, is it possible for Compressed(A) to equal Compressed(B) when A != B? Put another way, if the compressed representations of A and B match, are A and B guaranteed to match?

knowads
  • No. If the compressed images matched for different A and B, that would be "lossy" compression. Asking whether the compressed representations of A and B can match while A and B do not is exactly the same question. – Kaz Apr 02 '19 at 01:55
  • If this were possible, then decompression of either would result in the same thing, which could be A, B or something else. – Scott Hunter Apr 02 '19 at 01:55
  • So Zstd and Zlib are both bijective? Therefore, if I simply want to tell whether A != B, I don't have to decompress? – knowads Apr 02 '19 at 04:22
  • No, A could have been compressed with different versions of the library, or different compression levels, which would make compressed(A) != compressed(A). However, if compressed(A) == compressed(B) then A == B. – Nick Apr 02 '19 at 17:52

1 Answer

The question isn't really about bijection.

These algorithms could be bijective only if, for a given File A, there were one and only one possible Compressed(A).

That is clearly not the case: just vary the compression level and you get multiple different versions of Compressed(A), all of which decompress back to the same File A. So it's not a bijection.
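This is easy to check with a minimal sketch. It assumes Python's standard `zlib` module; the sample data and variable names are made up for illustration, and the exact bytes produced depend on the zlib build in use.

```python
import zlib

# Hypothetical sample input; any reasonably compressible data will do.
data = b"the quick brown fox jumps over the lazy dog " * 200

# Compress the same input at two different levels.
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# The two compressed streams are not byte-identical...
differ = fast != best

# ...yet both decompress back to exactly the same original.
roundtrip_ok = zlib.decompress(fast) == data and zlib.decompress(best) == data

print("compressed streams differ:", differ)
print("both round-trip to the original:", roundtrip_ok)
```

So one File A maps to several valid Compressed(A) outputs, depending on the parameters chosen.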

However, the other direction is guaranteed: decompression is deterministic, so a given Compressed(something) can regenerate one and only one something. And since the compression is lossless, if Compressed(A) == Compressed(B), then necessarily A == B.

But don't confuse that with bijection. When A == B, it doesn't follow that Compressed(A) == Compressed(B), since the two may have been compressed differently (using different compression levels or other advanced parameters).
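In practice, this means matching compressed bytes are conclusive, while differing compressed bytes are not. Here is a rough sketch of a comparison that takes the shortcut when it can, again assuming Python's `zlib`; the helper name `same_content` and the sample payloads are hypothetical.

```python
import zlib


def same_content(comp_a: bytes, comp_b: bytes) -> bool:
    """Return True if two zlib streams decompress to the same data."""
    # Identical compressed bytes imply identical originals (lossless),
    # so no decompression is needed in this case.
    if comp_a == comp_b:
        return True
    # Different compressed bytes are inconclusive: the same data may
    # have been compressed with different settings. Decompress to decide.
    return zlib.decompress(comp_a) == zlib.decompress(comp_b)


payload = b"example payload " * 100      # hypothetical sample data
a = zlib.compress(payload, 1)            # same data, level 1
b = zlib.compress(payload, 9)            # same data, level 9
c = zlib.compress(payload + b"!", 9)     # different data

print(same_content(a, a))  # shortcut path: identical streams
print(same_content(a, b))  # different streams, same content
print(same_content(a, c))  # different streams, different content
```

For large files, comparing a checksum of the uncompressed data stored alongside each stream may be cheaper than full decompression, but that is a design choice outside the scope of the question.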

Cyan