
Does anyone know of a state-of-the-art LOSSY compression program for data BESIDES music and images? I need actual executable or compilable source code.

I am trying to compress AMillionRandomDigits.bin.

The idea is to lossily compress AMillionRandomDigits.bin, then store LOSSY_COMPRESSED(amillionrandomdigits.bin) + DIFF(LOSSY_UNCOMPRESSED, amillionrandomdigits.bin). http://www.stanford.edu/~hwang41/
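The proposed scheme can be sketched in a few lines. This is a toy illustration, not a real compressor: the "lossy" step here is just an assumed placeholder (dropping each byte's low nibble), and the function names are mine, not from any library.

```python
import random

def lossy_compress(data):
    # Toy "lossy" transform (an assumption for illustration):
    # keep only the high nibble of each byte.
    return bytes(b & 0xF0 for b in data)

def diff(lossy, original):
    # Residual needed to restore the original exactly.
    return bytes(o ^ l for o, l in zip(original, lossy))

def reconstruct(lossy, residual):
    return bytes(l ^ r for l, r in zip(lossy, residual))

random.seed(0)
original = bytes(random.randrange(256) for _ in range(1000))
l = lossy_compress(original)
d = diff(l, original)
assert reconstruct(l, d) == original  # round trip is lossless
# For random input, the two parts together are no smaller than the original:
print(len(l) + len(d), ">=", len(original))
```

The catch, as the answers below suggest, is that for truly random input the residual `d` is itself random and therefore incompressible, so the scheme cannot come out ahead.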

Charles
  • What's it for? Unless the target domain is known, the concept of LOSSY is not DEFINED. – OrangeDog Jan 14 '11 at 20:23
  • I can't even come up with a use case for a lossy compression algorithm for anything other than music and images (and I presume video as well). In pretty much any other realm (e.g. finance, air traffic control, etc.) any lossiness at all is unacceptable. – MusiGenesis Jan 14 '11 at 20:27
  • @MusiGenesis: And don't forget speech (codecs other than those used for music). – Christian Ammer Jan 14 '11 at 20:48
  • @Christian: you're right, but I kind of mentally included speech in with music (since they're both audio). – MusiGenesis Jan 14 '11 at 22:22
  • here is the result of my quick and dirty implementation of a lossy compressed million random digits. `0` –  May 09 '11 at 18:19
  • In all seriousness, lossy means you are losing data: compressing it in such a way that the original can never be recovered exactly, only approximately. Image and audio compression only discard what you can't perceive anyway, so the reconstructed approximation looks or sounds like the original. –  May 09 '11 at 18:21

1 Answer


@user562688: Compressing truly random data can't be done. The proof idea: if you're trying to compress 100 bits to 90 bits, then all 2^100 strings would need to fit inside a space of size 2^90, which is too small. By the pigeonhole principle each output corresponds to 2^10 inputs on average, so there will be many collisions, which means you cannot decode back to the original string.
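The counting argument above can be checked exhaustively at a small scale. The `compressor` below is an arbitrary stand-in (plain truncation); the point is that the same pigeonhole bound holds for *every* function from 8-bit strings to 7-bit strings:

```python
from itertools import product

def compressor(bits):
    # Any candidate "compressor" from 8-bit strings to 7-bit strings;
    # truncation is chosen arbitrarily -- the bound below holds for all.
    return bits[:7]

inputs = [''.join(p) for p in product('01', repeat=8)]   # 2^8 = 256 strings
outputs = {compressor(s) for s in inputs}                # at most 2^7 = 128 codes

# Pigeonhole: 256 inputs cannot map injectively into <= 128 outputs,
# so some output has at least two preimages and decoding is ambiguous.
assert len(outputs) <= 2**7 < len(inputs)
```

Scaled up, 2^100 inputs squeezed into 2^90 codes gives 2^10 preimages per code on average, which is the collision count cited above.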

But to answer your original question: although the Johnson-Lindenstrauss algorithm isn't a compression algorithm per se, it has some properties similar to what's done in image compression.

The goal of the Johnson-Lindenstrauss algorithm is to take many vectors (say n vectors) in R^n and find a mapping into a much smaller space, R^O(log n), such that the distances between all the vectors change by very little.
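A minimal sketch of the idea, assuming the standard Gaussian random-projection construction (the function names and the choice of target dimension are illustrative, not a tuned implementation):

```python
import random, math

def random_projection(vectors, k, rng):
    # Project len(vectors[0])-dimensional vectors into R^k using a Gaussian
    # random matrix scaled by 1/sqrt(k), so that squared distances are
    # preserved in expectation (the Johnson-Lindenstrauss construction).
    n = len(vectors[0])
    R = [[rng.gauss(0, 1) / math.sqrt(k) for _ in range(n)] for _ in range(k)]
    return [[sum(row[i] * v[i] for i in range(n)) for row in R] for v in vectors]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

rng = random.Random(0)
n, k = 500, 300          # in theory k = O(log n / eps^2) suffices
vecs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(5)]
proj = random_projection(vecs, k, rng)

# Pairwise distances survive the 500 -> 300 dimension drop with low distortion.
for i in range(len(vecs)):
    for j in range(i + 1, len(vecs)):
        ratio = dist(proj[i], proj[j]) / dist(vecs[i], vecs[j])
        assert 0.7 < ratio < 1.4
```

The distortion bound checked here is deliberately loose; the lemma guarantees a (1 ± eps) factor with high probability once k is around 8 ln(n) / eps^2.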

Or Sattath
  • The question was about lossy compression, so of course it is possible to compress random numbers and reduce 100 bits to 90. An extreme case of such compression is just keeping the median, or removing every second number, etc. – noonex Oct 28 '15 at 08:55