Approaches for efficient compression of images with several focus planes

Question

I am working on a application where images at different focus planes are aquired and currently stored inside a multipage tif. Unfortunately the tif-based compression techniques does not benefit from the signal redundancy over the different focus planes.

I found some resourcs about this here ZPEG and here JPEG2000 Addon

unfortunately they are all far away from a standard.

I was wondering if there is probably a video codec which could achive great compression ratios in this scenario?

I am also very open very any other ideas.

score 1 · Answer 1 · answered Dec 23 '15 at 21:30

Here's a different approach: turning the cross-plane redundancy into spatial redundancy and then using standard image compression.

In the simplest way, just take strips of width*1 pixel, from every plane, and stack them. As an image, that will look vertically smeared in a weird way. It's best if this lines up with DCT blocks (if applicable) to avoid having a sharp horizontal edge through a block, so it should probably be padded to a multiple of (usually) 8 planes by duplicating a plane. You could gain a bit more by optimizing the padding for minimum energy, but that's complicated whereas duplicating is already pretty good and trivial.

It obviously wouldn't compress well with unfiltered lossless compression, but PNG with a suitable filter (up, average or paeth) should work.

Thats also a great idea. Panasonic GX8 Post Focus Feature is very simular to my problem and they solved it using a mp4 file. Likely with h.264 compression. https://www.youtube.com/watch?v=GdQGg9U1TGk — Andreas, Dec 27 '15 at 16:51

score 1 · Answer 2 · answered Jan 11 '16 at 13:35

The problem with tiff is that it does not support inter-component decorrelation in its baseline. There are some extensions (not very broadly supported) that allow storing other file compression formats (such as a complete JPEG2000 JP2 file, extension 0x8798), but it is not guaranteed that an standard decoder will process it correctly.

If you can use any tool you want, close to optimal coding performance is probably obtained with a good spectral decorrelation transform (the KLT for lossy compression and the RKLT for lossless compression - see http://gici.uab.cat/GiciWebPage/downloads.php#spectral for a JAVA implementation of these transform) and then a good compression algorithm such as JPEG2000. On the other hand, this approach can be a bit complicated to implement and slow due to the KLT/RKLT transforms.

Other simpler approach is to simply use JPEG2000 with the DWT for spectral decorrelation. For instance, if you use the Kakadu implementation (kakadusoftware.com), you just need to use the proper parameters when compressing. Here you have an example invocation extracted from http://kakadusoftware.com/wp-content/uploads/2014/06/Usage_Examples.txt,

Ai) kdu_compress -i catscan.rawl*35@524288 -o catscan.jpx -jpx_layers *
             -jpx_space sLUM Creversible=yes Sdims={512,512} Clayers=16
             Mcomponents=35  Msigned=no  Mprecision=12
             Sprecision=12,12,12,12,12,13  Ssigned=no,no,no,no,no,yes
             Mvector_size:I4=35 Mvector_coeffs:I4=2048
             Mstage_inputs:I25={0,34}  Mstage_outputs:I25={0,34}
             Mstage_collections:I25={35,35}
             Mstage_xforms:I25={DWT,1,4,3,0}
             Mnum_stages=1  Mstages=25
-- Compresses a medical volume consisting of 35 slices, each 512x512,
   represented in raw little-endian format with 12-bits per sample,
   packed into 2 bytes per sample.  This example follows example (x)
   above, but adds a multi-component transform, which is implemented
   using a 3 level DWT, based on the 5/3 reversible kernel (the kernel-id
   is 1, which is found in the second field of the `Mstage_xforms' record.

-- To decode the above parameter attributes, note that:

   a) There is only one multi-component transform stage, whose instance

      index is 25 (this is the I25 suffix found on the descriptive

      attributes for this stage).  The value 25 is entirely  arbitrary.  I

      picked it to make things interesting.  There can, in general, be

      any number of transform stages.

   b) The single transform stage consists of only one transform block,

      defined by the `Mstage_xforms:I25' attribute -- there can be

      any number of transform blocks, in general.

   c) This block takes 35 input components and produces 35 output

      components, as indicated by the `Mstage_collections:I25' attribute.

   d) The stage inputs and stage outputs are not permuted in this example;

      they are enumerated as 0-34 in each case, as given by the

      `Mstage_inputs:I25' and `Mstage_outputs:I25' attributes.

   e) The transform block itself is implemented using a DWT, whose kernel

      ID is 1 (this is the Part-1 5/3 reversible DWT kernel).  Block

      outputs are added to the offset vector whose instance index is 4

      (as given by `Mvector_size:I4' and `Mvector_coeffs:I4') and the

      DWT has 3 levels.  The final field in the `Mstage_xforms' record

      is set to 0, meaning that the canvas origin for the multi-component

      DWT is to be taken as 0.

   f) Since a multi-component transform is being used, the precision

      and signed/unsigned properties of the final decompressed (or

      original compressed) image components are given by `Mprecision'

      and `Msigned', while their number is given by `Mcomponents'.

   g) The `Sprecision' and `Ssigned' attributes record the precision

      and signed/unsigned characteristics of what we call the codestream

      components -- i.e., the components which are obtained by block

      decoding and spatial inverse wavelet transformation.  In this

      case, the first 5 are low-pass subband components, at the bottom

      of the DWT tree; the next 4 are high-pass subband components

      from level 3; then come 9 high-pass components from level 2 of

      the DWT; and finally the 17 high-pass components belonging to

      the first DWT level.  DWT normalization conventions for both

      reversible and irreversible multi-component transforms dictate

      that all high-pass subbands have a passband gain of 2, while

      low-pass subbands have a passband gain of 1.  This is why all

      but the first 5 `Sprecision' values have an extra bit -- remember

      that missing entries in the `Sprecision' and `Ssigned' arrays

      are obtained by replicating the last supplied value.

Approaches for efficient compression of images with several focus planes

2 Answers2