Questions tagged [compression]

The process of encoding data so that it uses fewer bits than the original representation.

There are many techniques for compressing information, and a large share of algorithms target images, speech, music or video. Encoding the same information in fewer bits saves a scarce resource such as transmission bandwidth or disk space. Lossy compression algorithms also exploit the limits of human perception: by spending fewer bits on the parts of a signal that people perceive less well, they can achieve substantially higher compression ratios.
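As a minimal sketch of the "fewer bits" idea, the snippet below compresses a deliberately repetitive byte string with Python's standard `zlib` module and checks that the original comes back unchanged. It only illustrates lossless compression; the perceptual (lossy) techniques mentioned above are not shown. The sample input is a made-up assumption chosen so that the redundancy is obvious.

```python
import zlib

# Hypothetical sample input: highly repetitive text, so there is plenty of redundancy.
original = b"the quick brown fox jumps over the lazy dog. " * 200

# Level 9 asks zlib for its strongest (and slowest) compression setting.
compressed = zlib.compress(original, 9)
restored = zlib.decompress(compressed)

# Lossless compression must reproduce the input exactly.
assert restored == original

print(f"original:   {len(original)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {len(compressed) / len(original):.3f}")
```

Real-world data is rarely this redundant, so the ratio printed here should not be read as typical.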

9175 questions
78
votes
3 answers

Why does base64-encoded data compress so poorly?

I was recently compressing some files, and I noticed that base64-encoded data seems to compress really badly. Here is one example: Original file: 429,7 MiB compress via xz -9: 13,2 MiB / 429,7 MiB = 0,031 4,9 MiB/s 1:28 base64 it and compress via xz…
Stefan Seidel
77
votes
9 answers

Why is *.tar.gz still much more common than *.tar.xz?

Whenever I see some source packages or binaries which are compressed with GZip I wonder if there are still reasons to favor gz over xz (excluding time travel to 2000). The savings of the LZMA compression algorithm are substantial and decompressions…
soc
75
votes
6 answers

Reducing video size with same format and reducing frame size

This question might be very basic. Is there a way to reduce the frame size or frame rate of a lossy-compressed video (WMV, MPEG) to get a smaller file in the same format? Are there any open source or proprietary APIs for this?
Vignesh
74
votes
5 answers

What is the meaning of O( polylog(n) )? In particular, how is polylog(n) defined?

Brief: When academic (computer science) papers say "O(polylog(n))", what do they mean? I'm not confused by the "Big-Oh" notation, which I'm very familiar with, but rather by the function polylog(n). They're not talking about the complex analysis…
Managu
74
votes
4 answers

Deleting files after adding to tar archive

Can GNU tar add many files to an archive, deleting each one as it is added? This is useful when there is not enough disk space to hold both the entire tar archive and the original files - and therefore it is not possible to simply manually delete…
Ivy
72
votes
10 answers

Compression formats with good support for random access within archives?

This is similar to a previous question, but the answers there don't satisfy my needs and my question is slightly different: I currently use gzip compression for some very large files which contain sorted data. When the files are not compressed,…
John Zwinck
70
votes
7 answers

How can I get gzip compression in IIS7 working?

I have installed Static and dynamic compression for IIS7, as well as setting the two web.config values at my application Virtual Folder level. As I understand it, I don't need to enable compression at the server, or site level anymore, and I can…
Russ
70
votes
2 answers

Python: Inflate and Deflate implementations

I am interfacing with a server that requires data sent to it to be compressed with the Deflate algorithm (Huffman encoding + LZ77) and that also sends data that I need to Inflate. I know that Python includes Zlib, and that the C libraries in Zlib support…
Demi
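One possible shape for a Deflate/Inflate pair in Python is sketched below using the standard `zlib` module. It assumes the server expects a raw Deflate stream with no zlib header or checksum, which the excerpt does not confirm; a negative `wbits` value is how `zlib` selects that raw format. The helper names `deflate` and `inflate` and the sample payload are hypothetical.

```python
import zlib

def deflate(data: bytes, level: int = 9) -> bytes:
    """Compress to a raw Deflate stream (assumed format: no zlib header or checksum)."""
    # A negative wbits value tells zlib to omit the header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()

def inflate(data: bytes) -> bytes:
    """Decompress a raw Deflate stream produced as above."""
    return zlib.decompress(data, -zlib.MAX_WBITS)

message = b"hello hello hello hello"  # hypothetical payload
round_tripped = inflate(deflate(message))
print(round_tripped == message)
```

If the server actually expects the zlib-wrapped format, plain `zlib.compress` and `zlib.decompress` would be the simpler choice.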
69
votes
6 answers

Why do real-world servers prefer gzip over deflate encoding?

We already know deflate encoding is a winner over gzip with respect to speed of encoding, decoding and compression size. So why do no large sites (that I can find) send it (when I use a browser that accepts it)? Yahoo claims deflate is "less…
Steve Clay
69
votes
16 answers

What is the computer science definition of entropy?

I've recently started a course on data compression at my university. However, I find the use of the term "entropy" as it applies to computer science rather ambiguous. As far as I can tell, it roughly translates to the "randomness" of a system or…
fluffels
66
votes
6 answers

When compressing and encrypting, should I compress first, or encrypt first?

If I were to AES-encrypt a file, and then ZLIB-compress it, would the compression be less efficient than if I first compressed and then encrypted? In other words, should I compress first or encrypt first, or does it matter?
Sei Satzparad
65
votes
3 answers

Does GZIP Compression Level Have Any Impact On Decompression

I understand that GZIP is a combination of LZ77 and Huffman coding and can be configured with a level from 1 to 9, where 1 indicates the fastest compression (less compression) and 9 indicates the slowest compression method (best compression). My…
David
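The level setting described in this question can be tried out with Python's standard `gzip` module, as in the rough sketch below. It compresses the same made-up payload at levels 1 and 9 and decompresses both; note that `gzip.decompress` takes no level argument, so both streams are read by the same routine. The payload is a hypothetical log-like string, and the sketch makes no timing claims.

```python
import gzip

# Hypothetical sample payload; any bytes object would do.
payload = b"timestamp=2024-01-01 level=INFO msg=request served\n" * 500

fast = gzip.compress(payload, compresslevel=1)  # fastest setting
best = gzip.compress(payload, compresslevel=9)  # slowest setting

# Decompression is independent of the level used to compress.
assert gzip.decompress(fast) == payload
assert gzip.decompress(best) == payload

print(f"level 1: {len(fast)} bytes")
print(f"level 9: {len(best)} bytes")
```

Measuring decompression speed would need a benchmark on representative data, which this sketch does not attempt.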
65
votes
15 answers

How many times can a file be compressed?

I was thinking about compression, and it seems like there would have to be some sort of limit to how far a file can be compressed; otherwise every file could be reduced to a single byte. So my question is, how many times can I compress a file before: It does…
samoz
64
votes
9 answers

How do you create a .gz file using PHP?

I would like to gzip compress a file on my server using PHP. Does anyone have an example that would input a file and output a compressed file?
AlBeebe
64
votes
4 answers

How to send a compressed archive that contains executables so that Google's attachment filter won't reject it

I have a directory that I want to compress and send by e-mail. I've tried this: tar -cvf filename.tar.gz directory_to_compress/ But when I try to send it by e-mail, Google says: filename.tar.gz contains an executable file. For security reasons,…
slackmart