8

I have written a Java program for compression. I compressed some text files and the file size was reduced. But when I tried to compress a PDF file, I did not see any change in file size after compression.

So I want to know what other kinds of files will not reduce in size after compression.

Thanks Sunil Kumar Sahoo

16 Answers

12

File compression works by removing redundancy. Therefore, files that contain little redundancy compress badly or not at all.

The kind of file with no redundancy that you're most likely to encounter is one that has already been compressed. In the case of PDF, that would specifically be PDFs consisting mainly of images which are themselves in a compressed image format like JPEG.
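A quick way to see this is to deflate a highly redundant input next to a pseudo-random one with java.util.zip.Deflater (a minimal sketch; the class name and sizes are made up for illustration):

```java
import java.util.Random;
import java.util.zip.Deflater;

public class RedundancyDemo {
    // Deflate the input and return the length of the compressed output.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length * 2 + 64];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        byte[] redundant = "abc".repeat(10_000).getBytes();  // highly redundant
        byte[] random = new byte[30_000];
        new Random(42).nextBytes(random);                    // almost no redundancy

        // The redundant input collapses; the random input barely changes.
        System.out.println("redundant " + redundant.length + " -> " + deflatedSize(redundant));
        System.out.println("random    " + random.length + " -> " + deflatedSize(random));
    }
}
```

Random bytes stand in here for already-compressed data: a good compressor's output looks statistically random, which is exactly why a second pass finds nothing left to remove.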

Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
6

JPEG/GIF/AVI/MPEG/MP3 and other already-compressed files won't change much after compression. You may see a small decrease in file size.

waqasahmed
  • 3,555
  • 6
  • 32
  • 52
5

Already-compressed files will not reduce in size when compressed again.

stefanw
  • 10,456
  • 3
  • 36
  • 34
4

Five years later, I have at least some real statistics to back this up.

I've generated 17439 multi-page PDF files with PrinceXML, totalling 4858 MB. Running zip -r archive pdf_folder gives me an archive.zip that is 4542 MB. That's 93.5% of the original size, so it's not worth it to save space.

Claes Mogren
  • 2,126
  • 1
  • 26
  • 34
3

The only files that cannot be compressed are random ones - truly random bits, or as approximated by the output of a compressor.

However, for any algorithm in general, there are many files that cannot be compressed by it but can be compressed well by another algorithm.

Will
  • 73,905
  • 40
  • 169
  • 246
2

PDF files are already compressed. They use the following compression algorithms:

  • LZW (Lempel-Ziv-Welch)
  • FLATE (ZIP, in PDF 1.2)
  • JPEG and JPEG2000 (PDF version 1.5)
  • CCITT (the facsimile standard, Group 3 or 4)
  • JBIG2 compression (PDF version 1.4)
  • RLE (Run Length Encoding)

Depending on which tool created the PDF and its version, different types of compression are used. You can compress it further using a more efficient algorithm, or lose some quality by converting images to low-quality JPEGs.

There is a great link on this here

http://www.verypdf.com/pdfinfoeditor/compression.htm

badbod99
  • 7,429
  • 2
  • 32
  • 31
  • Not really. Not all PDF files automatically store their content in compressed format. But you're right, PDF supports compression. Unless your PDF contains only images, there's a high probability that you could squeeze some extra space using ZIP or RAR – Salamander2007 Jul 16 '09 at 10:13
  • It 100% depends on the application which created the PDF, as mentioned in my post. – badbod99 Jul 21 '09 at 14:08
2

Files encrypted with a good algorithm like IDEA or DES in CBC mode don't compress any more, regardless of their original content. That's why encryption programs first compress and only then encrypt.
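A minimal sketch of why the order matters, using javax.crypto's AES in CBC mode and java.util.zip.Deflater (the class name, the fixed zero IV, and the sizes are purely for demonstration): the redundant plaintext deflates to almost nothing, while its ciphertext barely shrinks at all.

```java
import java.util.zip.Deflater;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.spec.IvParameterSpec;

public class EncryptThenCompress {
    // Deflate the input and return the length of the compressed output.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length * 2 + 64];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = "the quick brown fox ".repeat(2_000).getBytes(); // very redundant

        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128);
        Cipher aes = Cipher.getInstance("AES/CBC/PKCS5Padding");
        // Fixed all-zero IV only for this demo; never reuse an IV in real code.
        aes.init(Cipher.ENCRYPT_MODE, keyGen.generateKey(), new IvParameterSpec(new byte[16]));
        byte[] cipherText = aes.doFinal(plain);

        // The plaintext shrinks dramatically; the ciphertext does not shrink at all.
        System.out.println("plaintext  " + plain.length + " -> " + deflatedSize(plain));
        System.out.println("ciphertext " + cipherText.length + " -> " + deflatedSize(cipherText));
    }
}
```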

sharptooth
  • 167,383
  • 100
  • 513
  • 979
1

Generally you cannot compress data that has already been compressed. You might even end up with a compressed size that is larger than the input.
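For instance, gzipping twice with java.util.zip.GZIPOutputStream typically shows both effects (a small sketch; the class name and sample data are made up): the first pass shrinks the redundant input dramatically, while the second pass only adds header overhead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class DoubleCompression {
    // Gzip the input and return the compressed bytes.
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(input);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "some repetitive text ".repeat(5_000).getBytes();
        byte[] once  = gzip(original);  // much smaller than the original
        byte[] twice = gzip(once);      // typically larger than "once"

        System.out.println("original: " + original.length);
        System.out.println("once:     " + once.length);
        System.out.println("twice:    " + twice.length);
    }
}
```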

Martin Liversage
  • 104,481
  • 22
  • 209
  • 256
1

You will probably have difficulty compressing encrypted files too as they are essentially random and will (typically) have few repeating blocks.

Colin Desmond
  • 4,824
  • 4
  • 46
  • 67
0

Media files don't tend to compress well. JPEG and MPEG files won't compress further, though you may be able to compress .png files.

AutomatedTester
  • 22,188
  • 7
  • 49
  • 62
  • Actually JPEG and MPEG files can often be compressed a few percent by a good compression algorithm. – Michael Borgwardt Jul 16 '09 at 08:58
  • Are you sure? Remember that special-purpose compression algorithms often lose some data not important for the content (like noise in sound files or similar areas in images). That means they always have a better compression ratio than any general-purpose (mainly lossless) compression algorithm. – twk Jul 16 '09 at 09:05
  • But BMP files compress very well. This doesn't depend on the type of media, but on the compression type. And yes - file formats are a kind of compression of information. – smok1 Jul 16 '09 at 09:18
0

Files that are already compressed usually can't be compressed any further, for example MP3, JPG, FLAC, and so on. You could even end up with bigger files because of the header added by the re-compression.

Federico klez Culloca
  • 26,308
  • 17
  • 56
  • 95
0

Really, it all depends on the algorithm that is used. An algorithm that is specifically tailored to use the frequency of letters found in common English words will do fairly poorly when the input file does not match that assumption.

In general, PDFs contain images and the like that are already compressed, so they will not compress much further. Your algorithm is probably only able to eke out meagre savings, if any, from the text strings contained in the PDF.

Coxy
  • 8,844
  • 4
  • 39
  • 62
0

Simple answer: compressed files (otherwise we could reduce file sizes to 0 by compressing multiple times :). Many file formats already apply compression, and you might find that the file size shrinks by less than 1% when compressing movies, MP3s, JPEGs, etc.

soulmerge
  • 73,842
  • 19
  • 118
  • 155
0

You can add all Office 2007 file formats to the list (of @waqasahmed):

Since Office 2007 .docx and .xlsx (etc.) files are actually zipped .xml files, you might not see much size reduction in them either.

GvS
  • 52,015
  • 16
  • 101
  • 139
  • I created an Excel sheet with Python's XlsxWriter. When I resaved it with LibreOffice Calc, the size decreased by more than 60 percent. Why is that? – bakarin Sep 12 '18 at 06:32
0
  1. Truly random

  2. Approximation thereof, made by cryptographically strong hash function or cipher, e.g.:

    AES-CBC(any input)

    "".join(hashlib.md5(str(i).encode()).hexdigest() for i in range(...))

Dima Tisnek
  • 11,241
  • 4
  • 68
  • 120
0

Any lossless compression algorithm, provided it makes some inputs smaller (as the name compression suggests), will also make some other inputs larger.

Otherwise, the set of all input sequences up to a given length L could be mapped, without collisions (because the compression must be lossless and reversible), to the (much) smaller set of all sequences of length less than L; for example, there are 2^L bit sequences of length exactly L, but only 2^L - 1 of length less than L. The pigeonhole principle excludes that possibility.

So, there are infinitely many files which do NOT reduce in size after compression, and moreover a file doesn't need to be high-entropy for that to happen :)

Gianluca Ghettini
  • 11,129
  • 19
  • 93
  • 159