
I would like to use an image processing algorithm similar to some of the steps that JPEG compression uses for the Y component. JPEG applies the Discrete Cosine Transform (e.g. scipy.fftpack.dct) to obtain a weight for each combination of x and y cosine functions, and then quantises those weights with a quantisation rubric (quantisation table). Instead of cosine functions, I would like to use Gaussians as the basis.
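For reference, the JPEG-style step described above can be sketched roughly as follows. This is only a minimal illustration, not the real JPEG pipeline: the test block is a synthetic Gaussian blob and `q_table` is a made-up quantisation table, not the standard JPEG luminance table.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    # 2-D type-II DCT, applied separably: along axis 0, then along axis 1
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs):
    # Inverse of dct2 (same separable structure)
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

# Synthetic 8x8 block containing one Gaussian-like blob (assumption for the demo)
x = np.arange(8) - 3.5
block = 100.0 * np.exp(-(x[:, None] ** 2 + x[None, :] ** 2) / (2.0 * 2.0 ** 2))

# Hypothetical quantisation table: coarser steps for higher frequencies
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
q_table = 16.0 + 8.0 * (u + v)

coeffs = dct2(block)                    # one weight per (x, y) cosine combination
quantised = np.round(coeffs / q_table)  # small weights collapse to 0
restored = idct2(quantised * q_table)   # lossy reconstruction

print("non-zero weights kept:", np.count_nonzero(quantised), "of", quantised.size)
```

The weights are divided by the table and rounded, which is what sends most of the components to 0.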

The expected outcome of the algorithm would entail:

  • The input image is processed with a small kernel that looks at only a subset of the image at a time (e.g. 9x9 px)
  • The kernel is made of Discrete Gaussian combinations for both the x and the y (n x m) (like the 64 cosine combinations used for JPEG compression, but using Gaussians)
  • The kernel takes the subset array and computes a weight for each of the Discrete Gaussian combinations, and returns the weights as an n x m matrix
  • Then I can create a quantisation rubric, divide the matrix of Discrete Gaussian weights by this rubric element-wise, and round the result, so that most of the components become 0 (depending on the rubric thresholds used).
  • This would allow me to compress my image into only a few of the Gaussian components for each image subset.
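The steps in this list could be sketched as below. This is a rough sketch under several assumptions, not an established method: the names `gaussian_atoms`, `analyse` and `synthesise` are hypothetical, the atoms are 1-D Gaussians centred on each pixel with sigma = 1, and the "rubric" is a single scalar step. Because Gaussians are not orthogonal, the weights cannot be obtained by simple inner products as in the DCT; here they come from a least-squares solve using the pseudo-inverse.

```python
import numpy as np

def gaussian_atoms(size, centers, sigma):
    # Columns are 1-D Gaussians sampled at pixel positions 0..size-1
    x = np.arange(size)[:, None]
    c = np.asarray(centers)[None, :]
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

size = 9
Gx = gaussian_atoms(size, np.arange(size), sigma=1.0)  # atoms along x (m columns)
Gy = gaussian_atoms(size, np.arange(size), sigma=1.0)  # atoms along y (n columns)

def analyse(block):
    # Least-squares weights W (n x m) such that Gy @ W @ Gx.T approximates block
    return np.linalg.pinv(Gy) @ block @ np.linalg.pinv(Gx).T

def synthesise(weights):
    # Rebuild the block from the weights of the separable Gaussian combinations
    return Gy @ weights @ Gx.T

# Synthetic 9x9 block holding a single Gaussian feature (assumption for the demo)
block = 50.0 * np.outer(Gy[:, 4], Gx[:, 4])

weights = analyse(block)
step = 1.0                              # hypothetical scalar quantisation step
quantised = np.round(weights / step)    # most weights become 0
restored = synthesise(quantised * step)

print("non-zero weights kept:", np.count_nonzero(quantised), "of", quantised.size)
```

With these particular atoms the 9x9 system is invertible, so an unquantised block is recovered exactly; with fewer or wider atoms the same code gives only a least-squares approximation, and the solve can become badly conditioned.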

From my limited understanding of the maths, this should be possible because the Gaussian function used for the kernel is separable along the x and y axes (which should speed up processing, since the kernel can be applied as two 1-D passes instead of one 2-D pass).
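The separability can be checked numerically. The sketch below assumes a 9-tap Gaussian with sigma = 1.5 and a random test image (both arbitrary choices), and compares one 2-D convolution against two 1-D passes using scipy.ndimage.

```python
import numpy as np
from scipy.ndimage import convolve, convolve1d

def gaussian_1d(radius, sigma):
    # Normalised 1-D Gaussian sampled at integer offsets -radius..radius
    x = np.arange(-radius, radius + 1)
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

g = gaussian_1d(radius=4, sigma=1.5)  # 9 taps
kernel_2d = np.outer(g, g)            # 9x9 kernel = outer product of the 1-D kernels

rng = np.random.default_rng(0)
image = rng.random((64, 64))

# One 2-D pass: 81 multiplications per pixel
full = convolve(image, kernel_2d, mode="reflect")

# Two 1-D passes: 9 + 9 multiplications per pixel
rows = convolve1d(image, g, axis=0, mode="reflect")
separable = convolve1d(rows, g, axis=1, mode="reflect")

print("max difference:", np.abs(full - separable).max())
```

The two results agree up to floating-point rounding, so the cost per pixel grows with the kernel width rather than with its area.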

Question: Is there any algorithm/method that allows for all or some of the steps described (I use Python)? I am not sure what I am looking for in terms of terminology.

Thank you!

  • Are you picking the Gaussian here because you think it will improve compression? Or did you just pick a random function to replace the cosine with? I don’t think you can create an orthogonal base using Gaussians, which means you wouldn’t be able to recreate the input from your decomposition. You could look into wavelet decomposition, but you’d probably be reinventing JPEG2000. – Cris Luengo Jan 26 '21 at 15:12
  • I am picking a Gaussian function because the features in my data are circular (Gaussian like). If it is impossible to have an orthogonal base using Gaussians, I guess this approach is no longer possible... – Jordi Ferrer Jan 26 '21 at 18:06
  • Well, you could *detect* the Gaussians, and store only their origin and width. It'd be a strong simplification of the image, but maybe exactly what you need. It certainly would compress the image strongly, depending on the number of Gaussians in the image. – Cris Luengo Jan 26 '21 at 18:27
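One possible reading of the last suggestion (detect the Gaussians and store only their origin and width) is to fit a Gaussian model to each patch. The sketch below is only an illustration under assumptions: the patch is synthetic and noise-free, it contains a single isotropic Gaussian, and `gaussian_2d` is a hypothetical model function fitted with scipy.optimize.curve_fit.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_2d(coords, amplitude, x0, y0, sigma):
    # Isotropic 2-D Gaussian evaluated at flattened (x, y) pixel coordinates
    x, y = coords
    return amplitude * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2.0 * sigma ** 2))

# Pixel coordinates of a 9x9 patch, flattened to shape (2, 81)
yy, xx = np.mgrid[0:9, 0:9]
coords = np.vstack((xx.ravel(), yy.ravel())).astype(float)

# Synthetic, noise-free patch (assumption for the demo)
true_params = (80.0, 4.3, 3.6, 1.4)  # amplitude, x0, y0, sigma
patch = gaussian_2d(coords, *true_params)

# Initial guess: amplitude and position of the brightest pixel, and a rough width
peak = np.argmax(patch)
guess = (patch[peak], coords[0, peak], coords[1, peak], 2.0)

fitted, _ = curve_fit(gaussian_2d, coords, patch, p0=guess)
print("amplitude, x0, y0, sigma:", np.round(fitted, 3))
```

Each patch is then described by 4 numbers instead of 81 pixel values. Real data would need a detection step to find the blobs first, and the fit quality would depend on noise and on overlapping features.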
