6

I am trying to use FFTW for image convolution.

At first just to test if the system is working properly, I performed the fft, then the inverse fft, and could get the exact same image returned.

Then a small step forward, I used the identity kernel(i.e., kernel[0][0] = 1 whereas all the other components equal 0). I took the component-wise product between the image and kernel(both in the frequency domain), then did the inverse fft. Theoretically I should be able to get the identical image back. But the result I got is very not even close to the original image. I am suspecting this has something to do with where I center my kernel before I fft it into frequency domain(since I put the "1" at kernel[0][0], it basically means that I centered the positive part at the top left). Could anyone enlighten me about what goes wrong here?

ZV1
  • 61
  • 1
  • 4

3 Answers3

3

For each dimension, the indexes of samples should be from -n/2 ... 0 ... n/2 -1, so if the dimension is odd, center around the middle. If the dimension is even, center so that before the new 0 you have one sample more than after the new 0.

E.g. -4, -3, -2, -1, 0, 1, 2, 3 for a width/height of 8 or -3, -2, -1, 0, 1, 2, 3 for a width/height of 7.

The FFT is relative to the middle, in its scale there are negative points.
In the memory the points are 0...n-1, but the FFT treats them as -ceil(n/2)...floor(n/2), where 0 is -ceil(n/2) and n-1 is floor(n/2)

The identity matrix is a matrix of zeros with 1 in the 0,0 location (the center - according to above numbering). (In the spatial domain.)

In the frequency domain the identity matrix should be a constant (all real values 1 or 1/(N*M) and all imaginary values 0).

If you do not receive this result, then the identify matrix might need padding differently (to the left and down instead of around all sides) - this may depend on the FFT implementation.

Center each dimension separately (this is an index centering, no change in actual memory).

You will probably need to pad the image (after centering) to a whole power of 2 in each dimension (2^n * 2^m where n doesn't have to equal m).

Pad relative to FFT's 0,0 location (to center, not corner) by copying existing pixels into a new larger image, using center-based-indexes in both source and destination images (e.g. (0,0) to (0,0), (0,1) to (0,1), (1,-2) to (1,-2))

Assuming your FFT uses regular floating point cells and not complex cells, the complex image has to be of size 2*ceil(2/n) * 2*ceil(2/m) even if you don't need a whole power of 2 (since it has half the samples, but the samples are complex).

If your image has more than one color channel, you will first have to reshape it, so that the channel are the most significant in the sub-pixel ordering, instead of the least significant. You can reshape and pad in one go to save time and space.

Don't forget the FFTSHIFT after the IFFT. (To swap the quadrants.)
The result of the IFFT is 0...n-1. You have to take pixels floor(n/2)+1..n-1 and move them before 0...floor(n/2).
This is done by copying pixels to a new image, copying floor(n/2)+1 to memory-location 0, floor(n/2)+2 to memory-location 1, ..., n-1 to memory-location floor(n/2), then 0 to memory-location ceil(n/2), 1 to memory-location ceil(n/2)+1, ..., floor(n/2) to memory-location n-1.

When you multiply in the frequency domain, remember that the samples are complex (one cell real then one cell imaginary) so you have to use a complex multiplication.

The result might need dividing by N^2*M^2 where N is the size of n after padding (and likewise for M and m). - You can tell this by (a. looking at the frequency domain's values of the identity matrix, b. comparing result to input.)

Danny Varod
  • 17,324
  • 5
  • 69
  • 111
  • 1. But for the input as an image, there isn't negative sample points right? 2. By channel, you mean color channel? 3. And could you please explain a little more about the FFTSHIFT? – ZV1 Jun 25 '12 at 22:06
  • @ZV1 See updated edit regarding values' scalling and regarding frequency domain of identity image. – Danny Varod Jun 26 '12 at 19:35
0

I think that your understanding of the Identity kernel may be off. An Identity kernel should have the 1 at the center of the 2D kernal not at the 0, 0 position.

example for a 3 x 3, you have yours setup as follows:

1, 0, 0
0, 0, 0
0, 0, 0

It should be

0, 0, 0
0, 1, 0
0, 0, 0

Check this out also

What is the "do-nothing" convolution kernel

also look here, at the bottom of page 3.

http://www.fmwconcepts.com/imagemagick/digital_image_filtering.pdf

Community
  • 1
  • 1
trumpetlicks
  • 7,033
  • 2
  • 19
  • 33
  • Yeah theoretically it should be at the center. But when we perform the 2d discrete fourier transform of a kernel, we should move the center right? And if it is a more complex kernel, a Gaussian for example, how should I center it? BTW, what should the identity kernel look like in the frequency domain? I may do some debugging in my code. Thank you. – ZV1 Jun 25 '12 at 19:40
  • it heavily depends on what you're trying to accomplish, in your unity case though, it should be centered as above! – trumpetlicks Jun 25 '12 at 19:54
  • Ultimately I will need to take the convolution between the image and a Gaussian-like kernel using fftw. In that case, how should I place the kernel? – ZV1 Jun 25 '12 at 20:32
  • Same, a gaussian filter should be kernel centered! – trumpetlicks Jun 25 '12 at 20:37
-1

I took the component-wise product between the image and kernel in frequency domain, then did the inverse fft. Theoretically I should be able to get the identical image back.

I don't think that doing a forward transform with a non-fft kernel, and then an inverse fft transform should lead to any expectation of getting the original image back, but perhaps I'm just misunderstanding what you were trying to say there...

twalberg
  • 59,951
  • 11
  • 89
  • 84
  • In this case, with a true identity kernel, going forward or reverse should do nothing to the image!!! That is indeed the property of an identity kernel. Look at the bottom of page 3 of the following! http://www.fmwconcepts.com/imagemagick/digital_image_filtering.pdf – trumpetlicks Jun 25 '12 at 20:01
  • @trumpetlicks - An identity transform alone, or followed by a forward fft and then an inverse fft, yes. However, as written, it was an identity transform followed by inverse fft, which should result in a scaled fft result, *not* the original image. – twalberg Jun 25 '12 at 20:06
  • I guess maybe what I missed is that the identity kernel was being applied in the frequency domain, implying forward fft was already done - just not explicitly stated... Sometimes I read too fast. – twalberg Jun 25 '12 at 20:10
  • good point, my initial reading lead me to believe he was questioning the identity kernel itself, which it seems he also had shifted to the top-right. – trumpetlicks Jun 25 '12 at 20:11
  • LOL, maybe we're both incorrect :-) His sentence "I took the component-wise product between the image and kernel in frequency domain, then did the inverse fft" has me baffled as to whether he applied the kernel to an already FFTed version of the image, then did the ifft on that, or what??? – trumpetlicks Jun 25 '12 at 20:11
  • What I understood is that he did ImageOut=IFFT(FFT(ImageIn)). Which should lead to an almost identical result (almost since frequency range is finite), if he did it right (indexing, padding, shifting). – Danny Varod Jun 26 '12 at 14:52
  • At first I did ImageOut=IFFT(FFT(ImageIn)). Then I tried ImageOut=IFFT(FFT(ImageIn)*FFT(IdentityKernel)) but failed. – ZV1 Jun 27 '12 at 00:07