Introduction: What I am working on.
Hello everyone! I am working on a demosaicing algorithm that transforms Bayer-pattern images into images representing the red, green and blue channels. I would like the algorithm to have the following properties:
- It preserves as much raw information as possible.
- It does not obscure details in the image, even if that means no denoising.
- It produces as few artifacts as possible.
- If the mosaic image is N x N, each of the three color images should be N/2 x N/2.
- It should be fast. To put "fast" into context: I will settle for anything that is at least twice as fast as OpenCV's algorithm that uses bilinear interpolation.
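To make the speed criterion concrete, both routines would have to be timed on the same input. A minimal sketch of such a measurement follows; `medianMillis` is a hypothetical helper name, not part of OpenCV or of my code, and it only assumes that the routine under test can be wrapped in a callable.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: calls `work` exactly `runs` times and returns the
// median wall-clock time of a single call, in milliseconds.
template <typename F>
double medianMillis(F&& work, int runs) {
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    // The median is less sensitive to a cold cache on the first run
    // than the mean would be.
    const auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

The ratio of the two medians (the OpenCV conversion versus the candidate routine, each wrapped in a lambda) then says whether the "at least twice as fast" target is met for a given image size.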
What I have achieved so far.
So far, I've come up with an algorithm that uses bilinear interpolation and produces three images, each half the width and height of the mosaic image. It is approximately 3-4 times faster than OpenCV's cvtColor performing the CV_BayerBG2BGR conversion (bilinear interpolation).
See the sketch of the Bayer pattern below to get an idea of how it works. I perform the interpolation at the points marked by circles. The numbers are the coefficients by which I multiply the underlying pixels to get the interpolated value at the point marked by the black circle.
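In code, this kind of half-resolution bilinear interpolation can be sketched roughly as below. This is a plain scalar sketch under stated assumptions, not the optimized SSE version: it assumes 8-bit data in an RGGB layout, one output pixel at the centre of every 2x2 cell, and the weights 9/16, 3/16, 3/16, 1/16 that bilinear interpolation yields at that centre for red and blue (the coefficients in the figure may differ). `HalfPlanes` and `demosaicHalf` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the three half-size colour planes.
struct HalfPlanes {
    int width = 0, height = 0;
    std::vector<std::uint8_t> r, g, b;
};

// Assumed layout (RGGB): R at (even row, even col), B at (odd row, odd col),
// G everywhere else. Width and height must be even.
HalfPlanes demosaicHalf(const std::vector<std::uint8_t>& raw, int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(raw.size() == static_cast<std::size_t>(width) * height);

    HalfPlanes out;
    out.width = width / 2;
    out.height = height / 2;
    const std::size_t n = static_cast<std::size_t>(out.width) * out.height;
    out.r.resize(n);
    out.g.resize(n);
    out.b.resize(n);

    auto at = [&](int y, int x) -> int {
        return raw[static_cast<std::size_t>(y) * width + x];
    };

    for (int oy = 0; oy < out.height; ++oy) {
        const int y = 2 * oy;  // top row of the 2x2 cell
        // Farther same-colour rows; at the border the nearest row is reused.
        const int yRedFar = (y + 2 < height) ? y + 2 : y;
        const int yBlueFar = (y - 1 >= 0) ? y - 1 : y + 1;
        for (int ox = 0; ox < out.width; ++ox) {
            const int x = 2 * ox;  // left column of the 2x2 cell
            const int xRedFar = (x + 2 < width) ? x + 2 : x;
            const int xBlueFar = (x - 1 >= 0) ? x - 1 : x + 1;

            // Red and blue: 9-3-3-1 weights over the four nearest samples.
            const int red = 9 * at(y, x) + 3 * at(y, xRedFar)
                          + 3 * at(yRedFar, x) + at(yRedFar, xRedFar);
            const int blue = 9 * at(y + 1, x + 1) + 3 * at(y + 1, xBlueFar)
                           + 3 * at(yBlueFar, x + 1) + at(yBlueFar, xBlueFar);
            // Green: the cell centre is midway between the cell's two greens.
            const int green = at(y, x + 1) + at(y + 1, x);

            const std::size_t o = static_cast<std::size_t>(oy) * out.width + ox;
            out.r[o] = static_cast<std::uint8_t>((red + 8) >> 4);    // rounded /16
            out.g[o] = static_cast<std::uint8_t>((green + 1) >> 1);  // rounded /2
            out.b[o] = static_cast<std::uint8_t>((blue + 8) >> 4);   // rounded /16
        }
    }
    return out;
}
```

The sketch makes a single pass over the mosaic and writes each output pixel once. The 9-3-3-1 kernel is also separable into two (3a + b) / 4 steps, which is one possible starting point for a SIMD version.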
You can see the results of my algorithm below. I've also added the results of both demosaicing algorithms available in OpenCV (bilinear interpolation and variable number of gradients). Please note that while the results of my algorithm look really poor in comparison, OpenCV's bilinear interpolation results look almost exactly the same once I downsample them. This is of course expected, as the underlying algorithm is the same.
... so finally: the question.
My current solution gives acceptable results for my project and is also acceptably fast. However, I would be willing to use an algorithm up to twice as slow if it brought improvements in any of the 5 criteria listed above. The question, then, is: how can I improve my algorithm without significantly hurting performance?
I have enough programming experience for this task, so I am not specifically asking for code snippets. Answers of any kind (code, links, suggestions, especially ones based on past experience) are welcome.
Some additional information:
- I am working in C++.
- The algorithm is highly optimized: it uses SSE instructions and is not parallelized.
- I work with large images (a few MB in size); cache-awareness and avoiding multiple passes over the image are very important.
I am not looking for general programming advice (such as optimization in general), but task-specific answers are more than welcome. Thank you in advance.