I'm looking for a lossless data compression algorithm implementation that can run on an STM32L4. The data is ECG curves (basically a stream of 16-bit numerical values that are relatively close to one another).

I've found different implementations, for example miniz, but they all use dynamic memory allocation (which I want to avoid) and are also quite complicated and resource-consuming.

I've read this post but there isn't really an answer there. I'd like to avoid modifying an existing implementation to get rid of the dynamic allocation, since this functionality (data compression) is not my main priority.

I don't need a fancy state-of-the-art algorithm, but rather a simple, resource-limited one that saves some bandwidth when sending my data over the air, even if the compression ratio is not the best.

Do you have any idea of an algorithm that might fit?

Guillaume Petitjean
  • Could you share a sample of the data somewhere, or say how close the values are between samples? How much compression do you expect? – vlk Aug 03 '18 at 09:11
  • Here are a few consecutive samples: -12,-12,-12,-11,-15,-8,-16,-29,28,169,327,217,-79,-91,-59,-41,-36,-29,-26,-24,-22,-19,-14,-14,-12,-10,-10,-5,-2,1,5,10,15,17,21,22,20,20,15,9,2,-6 – Guillaume Petitjean Aug 03 '18 at 09:52
  • I don't really have a target in terms of compression; it's more a tradeoff between compression ratio and required CPU resources (code size, RAM size, and computation time). – Guillaume Petitjean Aug 03 '18 at 09:53

2 Answers

I was using https://github.com/pfalcon/uzlib

It uses malloc, but it is very easy to amend to use fixed-size buffers.

Take a look and give it a try.
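To illustrate the kind of amendment meant here: uzlib's actual allocation points are not shown in this answer, but a common embedded pattern is to replace `malloc()` with a bump allocator over a static pool. This is a generic sketch under that assumption; `pool_alloc` and the 16 KB pool size are my own names and numbers, not part of uzlib:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical drop-in for malloc(): hands out chunks from a static pool.
   Sufficient when the library allocates at init and never frees,
   which is the typical pattern for compression work buffers. */
static uint8_t pool[16 * 1024];   /* size to the library's worst case */
static size_t pool_used;

void *pool_alloc(size_t n)
{
    n = (n + 3) & ~(size_t)3;         /* round up to keep 4-byte alignment */
    if (pool_used + n > sizeof pool)
        return NULL;                  /* pool exhausted */
    void *p = &pool[pool_used];
    pool_used += n;
    return p;
}
```

With this, the library's `malloc()` calls can be pointed at `pool_alloc()` and the worst-case RAM use becomes a compile-time constant instead of a heap requirement.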

0___________
  • Seems that the data are too random for `uzlib`'s algorithms; I'm not sure the compressed size will be significantly smaller. – vlk Aug 03 '18 at 13:04
  • if you don't try you will never know :) – 0___________ Aug 03 '18 at 13:14
  • I'm surprised that in the default configuration the lib uses at least 16KB of stack. IMHO it's not really compatible with "deeply embedded" systems (which seems to be the target). Of course this can be reduced, but with a significant impact on performance? – Guillaume Petitjean Aug 06 '18 at 12:29
  • It is so easily amendable. – 0___________ Aug 06 '18 at 12:29
  • You're probably right. It was just a comment for people looking to use the library "as it is". – Guillaume Petitjean Aug 06 '18 at 12:35
  • @vlk, I did only quick tests. The compression ratio is not so good: 2000 bytes compress to around 1500 bytes with 12-bit hashes. Which is a bit surprising, because in practice the input byte stream I try to compress looks like (16-bit samples): F4 FF F4 FF F4 FF F5 FF F1 FF F8 FF, so it seems there is room for a good compression ratio. – Guillaume Petitjean Aug 06 '18 at 12:45
  • "It is so easily amendable." Not so much. Reducing the hash size is obvious but it makes the compression ratio really bad. I ended up designing my own specific compression algorithm. – Guillaume Petitjean Aug 07 '18 at 06:49

Based on your example data, you can write your own very simple compression, with no external library, that is faster and may even give a better compression ratio.

If you look at your data, the difference between consecutive numbers usually fits in an 8-bit integer (int8_t), which can hold values between -128 and +127.

This means that you can store just the difference between consecutive numbers whenever it falls in the range -127..+127.

The value -128 (0x80) can be a magic number meaning that it is followed by one full 16-bit number. This magic number also serves as a synchronization marker and is used at the beginning of the stream.

Alternatively, use 4-bit numbers instead of 8-bit ones (this is a little more complex): the magic number is then -8, the stored range is -7..+7, and you pack two numbers into one byte.

So, in case of your example:

input    |  output 8bit  |  output 4bit
int16    |  int8  int16  |  int4  int16
---------+---------------+---------------
  -12    |  -128    -12  |    -8    -12
  -12    |     0         |     0
  -12    |     0         |     0
  -11    |     1         |     1
  -15    |    -4         |    -4
   -8    |     7         |     7
  -16    |    -8         |    -8    -16
  -29    |   -13         |    -8    -29
   28    |    57         |    -8     28
  169    |  -128    169  |    -8    169
  327    |  -128    327  |    -8    327
  217    |  -110         |    -8    217
  -79    |  -128    -79  |    -8    -79
  -91    |   -12         |    -8    -91
  -59    |    32         |    -8    -59
  -41    |    18         |    -8    -41
  -36    |     5         |     5
  -29    |     7         |     7
  -26    |     3         |     3
  -24    |     2         |     2
  -22    |     2         |     2
  -19    |     3         |     3
  -14    |     5         |     5
  -14    |     0         |     0
  -12    |     2         |     2
  -10    |     2         |     2
  -10    |     0         |     0
   -5    |     5         |     5
   -2    |     3         |     3
    1    |     3         |     3
    5    |     4         |     4
   10    |     5         |     5
   15    |     5         |     5
   17    |     2         |     2
   21    |     4         |     4
   22    |     1         |     1
   20    |    -2         |    -2
   20    |     0         |     0
   15    |    -5         |    -5
    9    |    -6         |    -6
    2    |    -7         |    -7
   -6    |    -8         |    -8     -6
---------+---------------+---------------------
   42    |    42      4  |    42    12     count
   84    |    42      8  |    21    24     bytes
   84    |        50     |       45        bytes total
  100%   |        60%    |       54%       compression ratio

So, as you can see, even a very simple algorithm can give a very good result.
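The 8-bit variant of the scheme above can be sketched in C as follows. This is my own illustration, not code from this answer; the function names `delta_encode`/`delta_decode` and the little-endian byte order after the magic byte are assumptions:

```c
#include <stddef.h>
#include <stdint.h>

#define MAGIC ((int8_t)-128)  /* escape: next two bytes hold a raw 16-bit sample */

/* Encode 16-bit samples as 8-bit deltas; deltas outside -127..+127 are
   escaped with MAGIC followed by the raw sample (little-endian).
   The first sample is always escaped so the decoder can synchronize.
   Returns the number of bytes written to out. */
size_t delta_encode(const int16_t *in, size_t n, uint8_t *out)
{
    size_t w = 0;
    int16_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t d = (int32_t)in[i] - prev;
        if (i == 0 || d < -127 || d > 127) {
            out[w++] = (uint8_t)MAGIC;
            out[w++] = (uint8_t)(in[i] & 0xFF);          /* low byte  */
            out[w++] = (uint8_t)((uint16_t)in[i] >> 8);  /* high byte */
        } else {
            out[w++] = (uint8_t)(int8_t)d;
        }
        prev = in[i];
    }
    return w;
}

/* Inverse transform: returns the number of samples written to out. */
size_t delta_decode(const uint8_t *in, size_t len, int16_t *out)
{
    size_t r = 0, n = 0;
    int16_t prev = 0;
    while (r < len) {
        int8_t b = (int8_t)in[r++];
        if (b == MAGIC) {
            prev = (int16_t)(in[r] | (in[r + 1] << 8));
            r += 2;
        } else {
            prev = (int16_t)(prev + b);
        }
        out[n++] = prev;
    }
    return n;
}
```

Since a delta of -128 is escaped rather than stored, the magic byte can never be confused with a valid delta. The 4-bit variant would work the same way, with an extra nibble-packing step.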

It is also possible to improve this algorithm further, for example by grouping identical values, or by compressing the 16-bit data that follows the magic number: for instance, the magic number could be followed by a count of uncompressed 16-bit numbers.

But everything depends on your data.

vlk