1

I would appreciate some help understanding the output of a small educational program I put together. I am new to OpenCV and don't have much C++ experience.

The goal of the script is to perform the following:

  • Load an image
  • Perform DFT on the image
  • Apply a circular binary mask to the spectrum, where the radius of the circle can be increased by hitting a key on the keyboard (essentially applying a very crude filter to the image)
  • Display the result of the inverse transform of the spectrum after the mask was applied

I have the basic functionality working: I can load the image, perform DFT, view the output image and increase the radius of the circle (advancing through a for-loop with the circle radius following i), and see the result of the inverse transform of the modified spectrum.

However I do not understand why the output is showing a vertically flipped copy of the input superimposed on the image (see example below with Lena.png). This is not my intended result. When I imshow() the inverse DFT result without applying the mask, I get a normal, non-doubled image. But after applying the mask, the IDFT output looks like this:

Output when normal "Lena.png" is given as input image

I am not looking for a quick solution: I would really appreciate if someone more experienced could ask leading questions to help me understand this result so that I can try to fix it myself.


My code:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

using namespace cv;

void expand_img_to_optimal(Mat &padded, Mat &img);
void fourier_transform(Mat &img);

int main(int argc, char **argv)
{
    Mat input_img;
    input_img = imread("Lena.png" , IMREAD_GRAYSCALE);

    if (input_img.empty())
    {
        fprintf(stderr, "Could not Open image\n\n");
        return -1;
    }

    fourier_transform(input_img);
    return 0;
}

void fourier_transform(Mat &img)
{
    Mat padded;
    expand_img_to_optimal(padded, img);

    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
    Mat complexI;
    merge(planes, 2, complexI);

    dft(complexI, complexI, DFT_COMPLEX_OUTPUT);
    
    // For-loop to increase mask circle radius incrementally
    for (int i=0; i<400; i+=10) {
        Mat mask = Mat::ones(complexI.size(), complexI.type());
        mask.convertTo(mask, CV_8U);

        Mat dest = Mat::ones(complexI.size(), complexI.type());

        circle(mask, Point(mask.cols/2, mask.rows/2), i, 0, -1, 8, 0);

        complexI.copyTo(dest, mask);    

        Mat inverseTransform;
        idft(dest, inverseTransform, DFT_INVERSE|DFT_REAL_OUTPUT);
        normalize(inverseTransform, inverseTransform, 0, 1, NORM_MINMAX);
        imshow("Reconstructed", inverseTransform);
        waitKey(0);
    }
}

void expand_img_to_optimal(Mat &padded, Mat &img) {
    int row = getOptimalDFTSize(img.rows);
    int col = getOptimalDFTSize(img.cols);
    copyMakeBorder(img, padded, 0, row - img.rows, 0, col - img.cols, BORDER_CONSTANT, Scalar::all(0));
}
avi
  • 170
  • 9

2 Answers2

4

This happens because you are inverse-transforming a frequency spectrum that is not conjugate-symmetric around the origin.

The origin of the frequency domain image is the top-left pixel. Your disk mask must be centered there. The frequency domain is periodic, so that the part of the mask that extends to the left of the image wraps around and comes in to the right edge, same with top and bottom edges.

The easiest way to generate a proper mask is to

  1. create it with the origin at (mask.cols/2, mask.rows/2), like you already do, and then
  2. apply the ifftshift operation.

OpenCV doesn’t have a ifftshift function, this answer has code that implements the ifftshift correctly.

Cris Luengo
  • 55,762
  • 10
  • 62
  • 120
  • Thank you for your helpful reply. To clarify, I should apply the `ifftshift` operation to the mask before the `complexI.copyTo(dest, mask_shifted)` line? I did this using the implementation from the response linked below, but I am seeing the same result. Happy to share more information if needed. https://stackoverflow.com/questions/29226465/fftshift-c-implemetation-for-opencv#:~:text=//%20reverse%20the%20swapping,q1)%3B%0A%0A%20%20%20%20return%20ret%3B%0A%7D – avi Dec 04 '22 at 18:18
  • @avi Yes, you need to modify `mask`. If you get the same result, your code changes didn’t affect the data. The linked post has many answers, which one did you use? Assuming your array is even in size, `fftshift` and `ifftshift` are the same, but if it’s odd in size (`getOptimalDFTSize` can return odd sizes) then the two are different. Make sure the pixel at index size/2 is moved to index 0. – Cris Luengo Dec 04 '22 at 18:33
  • Here is my implementation for an `ifftshift` for one image line, it needs to be run over each row and each column, and works in-place: https://github.com/DIPlib/diplib/blob/1256cfbc55abf968600e7560cdf513ea4032e468/src/transform/fourier.cpp#L62 – Cris Luengo Dec 04 '22 at 18:35
  • Ah - perhaps the "link to highlight" feature didn't work. I was referring to the response in that thread by giri. – avi Dec 04 '22 at 18:40
  • @avi yeah, don’t use that one, it crops odd-sized arrays to be even-sized. :( — Oh, it seems that they all assume even-sized arrays. Doh! – Cris Luengo Dec 04 '22 at 18:42
  • @avi I’ve added some simple code here, I hope it’s correct. It was not easy to type this on a phone! :) – Cris Luengo Dec 04 '22 at 19:00
  • Thank you very much! I have tried your code and three additional implementations, with the same result. So there must be something else at play here. I'll keep working on it. I certainly don't expect you to pick through my code -- but if you're interested in taking a look I'll keep it updated here (link jumps to ifft function call): https://github.com/avielbr/fourier_draw/blob/f6b2ae834cf6d66ce4995be03cc880cf69fd699a/main.cpp#L46 – avi Dec 04 '22 at 19:11
  • See my answer posted below. Looks like I found the root of the problem... Can't say I understand it, but I'm working on that part :P – avi Dec 04 '22 at 19:53
1

Firstly I'd like to thank @Cris Luengo for his helpful input on implementing the ifftshift in OpenCV.

In the end, the problem with my code was in this line:

Mat mask = Mat::ones(complexI.size(), complexI.type());

Instead of using the type of complexI, it looks like I should have used the type of img:

Mat mask = Mat::ones(complexI.size(), img.type());

Why? I'm not sure yet. Still trying to understand. Here is my complete code that is working how I intended:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>

using namespace cv;

void expand_img_to_optimal(Mat &padded, Mat &img);
void fourier_transform(Mat &img);
void ifft_shift(Mat &mask);                

int main(int argc, char **argv)
{
    Mat input_img;
    input_img = imread("Lena.png" , IMREAD_GRAYSCALE);

    if (input_img.empty())
    {
        fprintf(stderr, "Could not Open image\n\n");
        return -1;
    }

    fourier_transform(input_img);
    return 0;
}

void fourier_transform(Mat &img)
{
    Mat padded;
    expand_img_to_optimal(padded, img);

    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
    Mat complexI;
    merge(planes, 2, complexI);

    dft(complexI, complexI, DFT_COMPLEX_OUTPUT);

    for (float i=0; i<4000; i+=2) {
        // Create disk mask matrix
        Mat mask = Mat::ones(complexI.size(), CV_8U);
        circle(mask, Point(mask.cols/2, mask.rows/2), i, 0, -1, 8, 0);

        // Perform ifft shift
        ifft_shift(mask);

        // Destination matrix for masked spectrum
        Mat dest;
        complexI.copyTo(dest, mask);

        // Perform inverse DFT
        Mat inverseTransform;
        idft(dest, inverseTransform, DFT_INVERSE|DFT_REAL_OUTPUT);
        normalize(inverseTransform, inverseTransform, 0, 1, NORM_MINMAX);
        imshow("Reconstructed", inverseTransform);
        waitKey(0);
    }
}

void expand_img_to_optimal(Mat &padded, Mat &img) {
    int row = getOptimalDFTSize(img.rows);
    int col = getOptimalDFTSize(img.cols);
    copyMakeBorder(img, padded, 0, row - img.rows, 0, col - img.cols, BORDER_CONSTANT, Scalar::all(0));
}

void ifft_shift(Mat &mask) {
    // input sizes
    int sx = mask.cols;
    int sy = mask.rows;

    // input origin
    int cx = sx / 2;
    int cy = sy / 2;

    // split the quadrants
    Mat top_left(mask, Rect(0, 0, cx, cy));
    Mat top_right(mask, Rect(cx, 0, sx - cx, cy));
    Mat bottom_left(mask, Rect(0, cy, cx, sy - cy));
    Mat bottom_right(mask, Rect(cx, cy, sx - cx, sy - cy));

    // merge the quadrants in right order
    Mat tmp1, tmp2;
    hconcat(bottom_right, bottom_left, tmp1);
    hconcat(top_right, top_left, tmp2);
    vconcat(tmp1, tmp2, mask);
}
avi
  • 170
  • 9
  • 2
    You convert the mask image to `CV_8U`, why not create it with that type right away? I don’t know how OpenCV works with complex numbers, it seems like sometimes they’re seen as two channels? If so, casting to `CV_8U` might make a 2-channel image? Displaying the image, and its information, might help explain what is going on. – Cris Luengo Dec 04 '22 at 20:03
  • 1
    I actually didn't realize until now that the `type()` method and `convertTo()` were changing the same things. Fixed. The `complexI` matrix is indeed two channels, for real and imaginary components as per https://docs.opencv.org/2.4/doc/tutorials/core/discrete_fourier_transform/discrete_fourier_transform.html#:~:text=Transform%20the%20real%20and%20complex%20values%20to%20magnitude.%20A%20complex%20number%20has%20a%20real%20(Re)%20and%20a%20complex%20(imaginary%20%2D%20Im)%20part.%20The%20results%20of%20a%20DFT%20are%20complex%20numbers.%20The%20magnitude%20of%20a%20DFT%20is%3A – avi Dec 04 '22 at 20:12