
I just tried to convert a few JPEGs to a GIF image using some online services. For a collection of randomly selected JPEGs totalling 1.8 MB, the resulting GIF was about 3.8 MB in size (without any extra compression enabled).

I understand that GIF uses lossless compression, which is why I expected the output to be around 1.8 MB (the input size). Can someone please help me understand where this extra space comes from?

Additionally, is there a better way to bundle a set of images that are similar to each other (for transmission)?

dev

2 Answers


A JPEG is lossy, but it is still compressed. When it is decompressed into raw pixel data and then recompressed as a GIF, it is logical to end up with a larger file.

GIF is a poor compression method for photographs; it is mostly suited to flat-colored drawings. It uses LZW dictionary coding rather than plain RLE (run-length encoding), but the practical effect is similar: the encoder wins when pixel sequences repeat, so you need lots of same-colored pixels in horizontal sequence to get good compression.

If you have images that are similar to each other, maybe you should consider packing them as consecutive frames of a video stream (with the most similar images placed next to each other) and using a lossless video compressor (or even risking a lossy one), but this may be overkill.

George Birbilis
  • Thanks. I was stuck on the idea that GIF adds some kind of metadata; your explanation makes more sense. Since the GIF is effectively packing the decoded pixel data, I am also planning to try lossy compression on the output GIF before trying the lossy video approach (I am hoping it works much like JPEG) – dev Oct 21 '15 at 02:58
  • If you try the video approach (using a video as a container for the similar images so the codec can compress across frames), don't use an MJPEG compressor; try H.264, WebM, etc. MJPEG compresses each frame separately, so unless you only want it for packing the images together (you could also use a ZIP/RAR/etc. archive set to no compression, or another multi-blob container, for that), it won't give you any extra compression based on the image similarity. "...each video frame or interlaced field of a digital video sequence is compressed separately as a JPEG image." https://en.wikipedia.org/wiki/Motion_JPEG – George Birbilis Oct 21 '15 at 16:39
  • Also see if you can find a DIFFing algorithm for images, especially if they are all the same size. I can imagine an algorithm that uses octree subdivision (https://en.wikipedia.org/wiki/Octree) to compare them, generating appropriate DIFFgrams from one image to the next. The trick in that case is to sort the images by resemblance (smaller DIFFgram size), which in a brute-force approach would mean DIFFing all pairs first. Note that with DIFFgram-based compression, if the first parts are corrupted in transmission, the whole sequence cannot be restored. Just brainstorming though... – George Birbilis Oct 21 '15 at 16:45

If you have a color image, multiply width x height x 3. That is the size in bytes of the uncompressed image data, assuming 8 bits per color channel (24-bit RGB).
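As a quick illustration of that arithmetic, here is a minimal sketch. The function name `raw_rgb_size` and the 1920 x 1080 example dimensions are made up for illustration, and it assumes 8 bits per channel unless told otherwise.

```python
def raw_rgb_size(width: int, height: int, channels: int = 3,
                 bytes_per_channel: int = 1) -> int:
    """Size in bytes of uncompressed pixel data (no headers, no padding)."""
    return width * height * channels * bytes_per_channel


# Hypothetical example: one 1920 x 1080 photo decoded to 24-bit RGB.
size = raw_rgb_size(1920, 1080)
print(f"{size} bytes, about {size / 1_000_000:.1f} MB")  # 6220800 bytes
```

A JPEG of that photo is typically far smaller than this figure, which is why re-encoding the decoded pixels with a weaker compressor can easily exceed the original file size.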

GIF and JPEG are two different methods for compressing that data. GIF uses the LZW method of compression. In that method the encoder builds a dictionary of previously encountered data sequences and writes codes representing those sequences rather than the actual data. This can actually result in a file larger than the raw image data if the encoder cannot find such sequences.

Such repeated sequences are more likely to occur in drawings, where the same colors are used, than in photographic images, where the color varies subtly throughout.
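The dictionary idea described above can be sketched in a few lines. This is a simplified LZW encoder written only for illustration: it returns a list of integer codes and leaves out what a real GIF encoder adds (clear and end codes, variable-width bit packing, the 12-bit code limit). The helper name `lzw_encode` and the test data are assumptions, not part of any library.

```python
import random


def lzw_encode(data: bytes) -> list[int]:
    """Simplified LZW: emit one dictionary code per longest known sequence."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate            # keep extending the known sequence
        else:
            codes.append(table[current])   # emit code for the longest match
            table[candidate] = next_code   # learn the new, longer sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


# One flat "drawing-like" row versus pseudo-random "photo-like" data.
flat = bytes([7]) * 4096
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(4096))

print("flat :", len(lzw_encode(flat)), "codes for", len(flat), "bytes")
print("noisy:", len(lzw_encode(noisy)), "codes for", len(noisy), "bytes")
```

The flat input collapses to a small number of codes, while the noisy input needs nearly one code per byte. Since each code in a GIF takes more than 8 bits once the dictionary grows, that second case is where the output can end up larger than the raw data.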

JPEG uses a series of compression steps. These have the drawback that you might not get out exactly what you put in. The first of these is conversion from RGB to YCbCr. With integer sample values there is no exact 1-to-1 mapping between these color spaces, so small changes can occur there.
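To see how the color conversion alone can alter a pixel, here is a small round-trip sketch. It assumes the full-range JFIF conversion constants and 8-bit integer storage with clamping; the function names are made up for this example.

```python
def clamp8(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(rgb):
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr_to_rgb(ycbcr):
    y, cb, cr = ycbcr
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


def roundtrip(rgb):
    """RGB -> 8-bit YCbCr -> RGB, as integers at every stage."""
    return ycbcr_to_rgb(rgb_to_ycbcr(rgb))


for pixel in [(128, 128, 128), (255, 0, 0), (12, 200, 77)]:
    print(pixel, "->", rgb_to_ycbcr(pixel), "->", roundtrip(pixel))
```

Mid-gray survives the round trip, but saturated colors such as pure red generally do not come back bit-for-bit, because the intermediate values are rounded and clamped.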

Next is subsampling. The reason for going to YCbCr is that you can sample the Cb and Cr components at a lower rate than the Y component and still get a good representation of the original image. If you keep only 1 Cb and 1 Cr sample for every 4 Y samples, you reduce the amount of data to compress by half.
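The "half" figure can be checked with a short sketch. It assumes the common 4:2:0 layout, where each 2x2 block of pixels keeps four Y samples but only one Cb and one Cr sample, and it averages the block to produce that single chroma sample (real encoders may filter differently). The function names are illustrative.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even dimensions assumed)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4  # rounded mean
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def sample_count(width: int, height: int, subsampled: bool) -> int:
    """Total number of samples: one Y plane plus the Cb and Cr planes."""
    luma = width * height
    chroma = (width // 2) * (height // 2) if subsampled else width * height
    return luma + 2 * chroma


full = sample_count(16, 16, subsampled=False)
reduced = sample_count(16, 16, subsampled=True)
print(full, reduced, reduced / full)
print(subsample_420([[100, 102], [104, 106]]))
```

The averaged chroma value replaces four slightly different originals, which is another place where the decoded image stops matching the input exactly.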

Next is the discrete cosine transform (DCT). This is a real-number calculation performed on integer samples, which can produce rounding errors.

Next is quantization. In this step the DCT coefficients are divided by a step size and rounded to integers, so the less significant values become zero and are discarded (less data to compress). The rounding in that division also introduces errors.
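The last two steps can be sketched together on a single row of 8 samples. This is a deliberately reduced illustration: JPEG actually applies a two-dimensional DCT to 8x8 blocks and uses a table of different step sizes per frequency, whereas this sketch uses a one-dimensional DCT and one assumed step size (`step=16`) for every coefficient. The sample row is arbitrary.

```python
import math

N = 8  # JPEG works on 8x8 blocks; this sketch uses one 8-sample row


def dct(row):
    """Orthonormal 1-D DCT-II of N samples (floating-point result)."""
    out = []
    for k in range(N):
        scale = math.sqrt((1 if k == 0 else 2) / N)
        total = sum(x * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for n, x in enumerate(row))
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct() above."""
    out = []
    for n in range(N):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / N)
            total += scale * c * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(total)
    return out


def jpeg_like_roundtrip(row, step=16):
    """Transform, quantize with a single step size, then reconstruct."""
    quantized = [round(c / step) for c in dct(row)]  # small values become 0
    restored = idct([q * step for q in quantized])   # dequantize and invert
    return quantized, [round(v) for v in restored]


row = [52, 55, 61, 66, 70, 61, 64, 73]
quantized, restored = jpeg_like_roundtrip(row)
print("input    :", row)
print("quantized:", quantized)
print("restored :", restored)
```

Most of the quantized coefficients come out as zero, which is what makes the data cheap to store, and the restored row is close to the input without necessarily being identical to it.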

user3344003