
I am trying to implement a method described in a research paper for extracting Fourier features from images. I followed the steps while coding but repeatedly ran into errors related to the data types and dimensions of the input array. How do I pass complex values to cv.dft?

I followed these instructions from the paper:

Fourier Descriptors: Fourier descriptors provide a way to encode an image boundary by mapping every pixel position (x, y) into a complex number (x + iy).

  1. Record the coordinate values of each pixel sequentially (moving clockwise along the shape).
  2. Construct a complex-valued vector using the coordinate values recorded in step 1, i.e., (x, y) → (x + iy).
  3. Take the DFT of the complex-valued vector.

My problem arises at step 3.

This is my implementation:

def get_dft(image):
    coordinates = cv.findNonZero(image)
    # the code below removes an unnecessary dimension
    coordinates = coordinates.reshape(coordinates.shape[0], 2)
    y = coordinates[:, 1] * 1j  # convert to complex numbers
    # the code below removes an unnecessary dimension
    y = y.reshape(coordinates.shape[0], 1)
    x = coordinates[:, 0].reshape(coordinates.shape[0], 1)
    # the statement below will convert from two separate arrays
    # to a single array with each element  
    # of the form [a + jb]
    t = x + y
    # below is where the error occurs
    dft = cv.dft(t, flags=cv.DFT_COMPLEX_INPUT) 

This is the error I get:

TypeError: Expected cv::UMat for argument 'src'

When I try to convert the array with

a = numpy.ndarray(t)

I get

ValueError: sequence too large; cannot be greater than 32

The message seems to say that the array has more than 32 dimensions. I don't understand why that happens.

And when I try

a = numpy.ndarray([t])

I get the error

TypeError: only integer scalar arrays can be converted to a scalar index

In short, I want to follow the steps in the paper: build a complex-valued vector like

[[a+jb],[c+jd]...]    

and pass it to the DFT function.

a_guest
Vishwad

1 Answer

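I found a solution to the problem:

import cv2 as cv

def get_dft(image):
    # (x, y) coordinates of all non-zero pixels, returned with shape (N, 1, 2)
    coordinates = cv.findNonZero(image)
    # drop the middle axis and convert to float, since cv.dft needs floating-point input
    coordinates = coordinates.reshape(coordinates.shape[0], 2).astype(float)
    y = coordinates[:, 1].reshape(coordinates.shape[0], 1)
    x = coordinates[:, 0].reshape(coordinates.shape[0], 1)
    t = cv.merge([x, y])  # two channels: x as the real part, y as the imaginary part
    dft = cv.dft(t, flags=cv.DFT_COMPLEX_INPUT)
    return dft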

I tried several approaches with the NumPy API, and all of them failed for reasons I don't understand, but fortunately the OpenCV function

cv.merge(...)

worked.

It takes multiple input arrays and joins them into a single multi-channel output array.
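
A possible explanation for the NumPy failures: np.ndarray(...) is the low-level array constructor, and its first argument is a shape rather than data, so passing the array t asks for an array whose dimensions are given by the elements of t. As a minimal sketch, the same two-channel layout can also be built with NumPy alone. The coords array below is made-up sample data standing in for the output of cv.findNonZero, and the variable names are illustrative.

```python
import numpy as np

# Made-up boundary coordinates standing in for cv.findNonZero output.
coords = np.array([[0, 0], [2, 0], [2, 1], [0, 1]], dtype=np.float64)

# Complex vector x + iy, as in step 2 of the paper.
z = coords[:, 0] + 1j * coords[:, 1]

# Note: np.ndarray(z) would interpret z as a shape, not as data.
# z is already an array, so no conversion is needed here.

# Two-channel (real, imaginary) layout with shape (N, 1, 2).
two_channel = np.stack([z.real, z.imag], axis=-1).reshape(-1, 1, 2)

# The same layout obtained by reinterpreting the complex128 buffer
# as interleaved float64 pairs.
two_channel_view = z.view(np.float64).reshape(-1, 1, 2)

print(two_channel.shape)
print(np.array_equal(two_channel, two_channel_view))
```

Either array has the shape and dtype that cv.merge([x, y]) produces for two (N, 1) float inputs.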

I also tried passing complex numbers directly to the OpenCV function

cv.dft(...)

but that is not the correct way to supply complex input. The OpenCV documentation explains how complex input is handled.

It states that the flag, cv.DFT_COMPLEX_INPUT

specifies that input is complex input. If this flag is set, the input must have 2 channels. On the other hand, for backwards compatibility reason, if input has 2 channels, input is already considered complex

Note that the trouble I had converting to two channels came from not properly understanding the cv::UMat structure that the function requires as input.

In summary, if you want to pass complex numbers to the OpenCV function

cv.dft(...)

your input must have 2 channels. To build a two-channel array, the OpenCV function

cv.merge(...)

(see its documentation) gets the job done when you need to combine multiple individual channels.

Vishwad