
I am new to using delta encoding to compress hex data. I used the C implementation from the Wikipedia article, so if my data is 0xFFFFFFF, 0xFFFFFFF, 0xFFFFFFF, 0xFFFFFFF, 0xFFFFFFF, the encoded result is 0xFFFFFFF, 0x0000000, 0x0000000, 0x0000000, 0x0000000. Unlike other lossless algorithms, where compression ratio = original size / compressed size, I found that the size of the data stays the same as before compression. So how can I calculate the compression ratio of delta encoding, and how can I compress the redundant deltas? The code is:

```
void delta_encode(unsigned char *buffer, int length)
{
    unsigned char last = 0;
    for (int i = 0; i < length; i++)
    {
        unsigned char current = buffer[i];
        buffer[i] = current - last;
        last = current;
    }
}

void delta_decode(unsigned char *buffer, int length)
{
    unsigned char last = 0;
    for (int i = 0; i < length; i++)
    {
        unsigned char delta = buffer[i];
        buffer[i] = delta + last;
        last = buffer[i];
    }
}
```
 
Heb
  • @chux-ReinstateMonica I have edited the question – Heb Jul 08 '21 at 01:38
  • "so how could i calculate compression ratio in delta encoding ?" --> Note that the worst case compression ratio is >= 1.0. – chux - Reinstate Monica Jul 08 '21 at 01:50
  • @chux-ReinstateMonica Thanks, yes, I know, but how to calculate the compression ratio when the data size stays the same is the missing part – Heb Jul 08 '21 at 02:16
  • @Heb There is no compression unless you do something like run-length compression, or Huffman coding, after converting the values to deltas. – user3386109 Jul 08 '21 at 03:02
  • @user3386109 Thanks, but will Huffman or RLE work with hex data in numeric format? – Heb Jul 08 '21 at 03:43
  • The [code from Wikipedia](https://en.wikipedia.org/wiki/Delta_encoding#Sample_C_code) you copied is borderline useless. It computes deltas, yes, but for every 8-bit sample in the input, it computes an 8-bit delta. Actually all but the first 8-bit sample. So it's N bytes in, N different bytes out. It's perfectly capable of reconstructing the original input, but the "compression ratio" is precisely: no compression at all (output is 100% of input, 0% savings). – Steve Summit Jul 08 '21 at 03:45
  • @SteveSummit Thanks, so what do you suggest I try in this case? – Heb Jul 08 '21 at 03:45
  • One possibility is to take, say, 8-bit differences between 16- or 32-bit input words. But then you have to decide what to do when the input has a big jump that won't fit in 8 bits. Another possibility, as user3386109 suggested, is [Huffman coding](https://en.wikipedia.org/wiki/Huffman_coding), which is reasonably straightforward to implement, but is a fair amount of work. – Steve Summit Jul 08 '21 at 03:47
  • You may want to check out Elias encoding (https://en.wikipedia.org/wiki/Elias_gamma_coding). The compression is mediocre, but it's a good place to start. – Neal Burns Jul 08 '21 at 15:48
  • You may also decide that this is simply not a productive avenue to have gone down. Delta encoding is great when the "quanta" you're differencing are large, with lots of redundancy. Delta encoding works great between multiple similar files, which is why it's often used in revision control systems. It's great between video frames, which is why it's often used in video encoding. It can be good between scanlines in a 2D image format, which is why it's used in some Facsimile formats. But individual bytes are small and typically don't have much redundancy to exploit. – Steve Summit Jul 08 '21 at 19:28
  • To clarify, it's possible in some circumstances to do data compression with delta encoding alone. For example if you knew _a priori_ that each value was always within +/- 127 of the previous value you could encode all the deltas as short int (8 bits). – 500 - Internal Server Error Jul 12 '21 at 10:07

1 Answer


Delta encoding is a step before compression that does not, itself, compress. It enables the subsequent compression. You should then take the result of delta encoding and feed it to a standard lossless compressor to see how much it compresses.

Your question in the comments "but will Huffman or RLE work with Hex data in numeric format?" suggests some confusion on your part. The result of the delta encoding is the contents of the array in binary. Not the hexadecimal and text representation of that binary data.

Mark Adler
  • Mark didn't blow his own horn here, but for the benefit of readers who don't recognize his name, I hope he won't mind if I point out that when it comes to [compression](https://en.wikipedia.org/wiki/Zlib) and [delta encoding](https://en.wikipedia.org/wiki/Delta_encoding), he [does know what he's talking about](https://en.wikipedia.org/wiki/Adler-32). – Steve Summit Jul 08 '21 at 18:21