Fourier Shape Descriptors

Question

I'm looking at a paper named "Shape Based Image Retrieval Using Generic Fourier Descriptors", but only have rudimentary knowledge of Fourier descriptors. I am attempting to implement the algorithm on page 12 of the paper, and have some results that I can't make much sense of.

If I create a small image, calculate the FD for it, and compare that FD to the one for the same image translated by a single pixel in the x and y directions, the descriptor is completely different, except for the first entry, which is exactly the same. Firstly, should these descriptors be exactly the same between the two images (as the descriptor is apparently scale, rotation, and translation invariant)?

Secondly, the paper mentions that descriptors of two separate images are compared by a simple Euclidean distance. If the descriptor really is translation invariant, the Euclidean distance between the two descriptors mentioned above should therefore be 0.

I quickly put together some JavaScript code to test out the algorithm, which is below.

Does anybody have any input, ideas, ways to move forward?

Thanks, Paul

    var iShape = [
     0,   0,   0,   0,   0,
     0,   0, 255,   0,   0,
     0, 255, 255, 255,   0,
     0,   0, 255,   0,   0,
     0,   0,   0,   0,   0
    ];
    
    var ImageWidth = 5, ImageHeight = 5, MaxRFreq = 5, MaxAFreq = 5;
    
    // Calculate centroid
    var cX = 0, cY = 0, pCount = 0;
    for (x = 0; x < ImageWidth; x++) {
     for (y = 0; y < ImageHeight; y++) {
      if (iShape[y * ImageWidth + x]) {
       cX += x;
       cY += y;
       pCount++;
      }
     }
    }
    
    cX = cX / pCount;
    cY = cY / pCount;
    
    console.log("cX = " + cX + ", cY = " + cY);
    
    // Calculate the maximum radius
    var maxR = 0;
    for (x = 0; x < ImageWidth; x++) {
     for (y = 0; y < ImageHeight; y++) {
      if (iShape[y * ImageWidth + x]) {
       var r = Math.sqrt(Math.pow(x - cX, 2) + Math.pow(y - cY, 2));
       if (r > maxR) {
        maxR = r;
       }
      }
     }
    }
    
    // Initialise real / imaginary table
    var i;
    var FR = [ ];
    var FI = [ ];
    for (r = 0; r < (MaxRFreq); r++) {
     var rRow = [ ];
     FR.push(rRow);
     var aRow = [ ];
     FI.push(aRow);
     for (a = 0; a < (MaxAFreq); a++) {
      rRow.push(0.0);
      aRow.push(0.0);
     }
    }
    
    var rFreq, aFreq, x, y;    
    for (rFreq = 0; rFreq < MaxRFreq; rFreq++) {
     for (aFreq = 0; aFreq < MaxAFreq; aFreq++) {
      for (x = 0; x < ImageWidth; x++) {
       for (y = 0; y < ImageHeight; y++) {
        var radius = Math.sqrt(Math.pow(x - maxR, 2) +
         Math.pow(y - maxR, 2));
        var theta = Math.atan2(y - maxR, x - maxR);
        if (theta < 0.0) {
         theta += (2 * Math.PI);
        }
        
        var iPixel = iShape[y * ImageWidth + x];
        FR[rFreq][aFreq] += iPixel * Math.cos(2 * Math.PI * rFreq *
         (radius / maxR) + aFreq * theta);
        FI[rFreq][aFreq] -= iPixel * Math.sin(2 * Math.PI * rFreq *
         (radius / maxR) + aFreq * theta);
         
       }
      }
     }
    }
    
    // Initialise fourier descriptor table
    var FD = [ ];
    for (i = 0; i < (MaxRFreq * MaxAFreq); i++) {
     FD.push(0.0);
    }
    
    // Calculate the fourier descriptor
    for (rFreq = 0; rFreq < MaxRFreq; rFreq++) {
     for (aFreq = 0; aFreq < MaxAFreq; aFreq++) {
      if (rFreq == 0 && aFreq == 0) {
       FD[0] = Math.sqrt(Math.pow(FR[0][0], 2) + Math.pow(FR[0][0], 2) /
        (Math.PI * maxR * maxR));
      } else {
       FD[rFreq * MaxAFreq + aFreq] = Math.sqrt(Math.pow(FR[rFreq][aFreq], 2) +
        Math.pow(FI[rFreq][aFreq], 2) / FD[0]);
      }
     }
    }
    
    for (i = 0; i < (MaxRFreq * MaxAFreq); i++) {
     console.log(FD[i]);
    }

Oh, the paper is at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.63.6312&rep=rep1&type=pdf if anybody is interested. — Paul, Jan 21 '15 at 12:54

Lazaros Tsochatzidis · Answer 1 · 2015-02-23T12:54:41.697

Three separate normalization techniques are applied here to make the final descriptor invariant to 1) translation, 2) scale, and 3) rotation.

For the translation invariance part you need to find the centroid of the shape and calculate the vector of every contour point with the centroid as the origin. This is done by subtracting the x and y coordinates of the centroid from each point's coordinates, respectively. So in your code the radius and theta of each point should be computed as follows:

    var radius = Math.sqrt(Math.pow(x - cX, 2) + Math.pow(y - cY, 2));
    var theta = Math.atan2(y - cY, x - cX);

For the scale invariance part you need to find the maximum magnitude (or radius, as you call it) over all vectors (already normalized for translation invariance) and divide the magnitude of each point by that maximum. An alternative way of achieving this is to divide every Fourier coefficient by the zero-frequency coefficient (the first coefficient), as the scale information is represented there. As far as I can see in your code and in the paper, this is implemented in the second way I described.

Finally, rotation invariance is achieved by keeping only the magnitude of the Fourier coefficients, as you can see in step 6 of the paper's pseudo-code.
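As a rough illustration of how the three steps fit together for the pixel-based descriptor in the question, here is a minimal sketch. The function names `gfd` and `euclidean` are made up for this example, and the normalization (magnitude of each coefficient divided by the magnitude of the zero-frequency coefficient, with the first entry divided by the area of the bounding circle) is an assumption about how the paper's final step is meant to be read. It also assumes the shape has more than one pixel, so that the maximum radius is non-zero.

```javascript
// Sketch only: polar Fourier descriptor measured from the shape centroid.
// gfd and euclidean are illustrative names, not taken from the paper.
function gfd(image, width, height, maxRFreq, maxAFreq) {
  var x, y, cX = 0, cY = 0, count = 0;

  // 1) Translation: centroid of the non-zero pixels.
  for (y = 0; y < height; y++) {
    for (x = 0; x < width; x++) {
      if (image[y * width + x]) { cX += x; cY += y; count++; }
    }
  }
  cX /= count;
  cY /= count;

  // 2) Scale: largest distance from the centroid to a shape pixel.
  var maxR = 0;
  for (y = 0; y < height; y++) {
    for (x = 0; x < width; x++) {
      if (image[y * width + x]) {
        maxR = Math.max(maxR, Math.hypot(x - cX, y - cY));
      }
    }
  }

  var fd = [], dc = 0;
  for (var rFreq = 0; rFreq < maxRFreq; rFreq++) {
    for (var aFreq = 0; aFreq < maxAFreq; aFreq++) {
      var re = 0, im = 0;
      for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++) {
          var pixel = image[y * width + x];
          if (!pixel) { continue; }
          // Polar coordinates relative to the centroid, not to maxR.
          var radius = Math.hypot(x - cX, y - cY);
          var theta = Math.atan2(y - cY, x - cX);
          var phase = 2 * Math.PI * rFreq * (radius / maxR) + aFreq * theta;
          re += pixel * Math.cos(phase);
          im -= pixel * Math.sin(phase);
        }
      }
      // 3) Rotation: keep only the magnitude of each coefficient.
      var mag = Math.hypot(re, im);
      if (rFreq === 0 && aFreq === 0) {
        dc = mag;
        fd.push(mag / (Math.PI * maxR * maxR)); // assumed area normalization
      } else {
        fd.push(mag / dc); // assumed normalization by the zero-frequency term
      }
    }
  }
  return fd;
}

// Plain Euclidean distance between two descriptors of equal length.
function euclidean(a, b) {
  var sum = 0;
  for (var i = 0; i < a.length; i++) { sum += Math.pow(a[i] - b[i], 2); }
  return Math.sqrt(sum);
}

// The plus shape from the question, and the same shape moved by (1, 1).
var plus = [
  0, 0, 0, 0, 0,
  0, 0, 255, 0, 0,
  0, 255, 255, 255, 0,
  0, 0, 255, 0, 0,
  0, 0, 0, 0, 0
];
var shifted = [
  0, 0, 0, 0, 0,
  0, 0, 0, 0, 0,
  0, 0, 0, 255, 0,
  0, 0, 255, 255, 255,
  0, 0, 0, 255, 0
];
var fdA = gfd(plus, 5, 5, 5, 5);
var fdB = gfd(shifted, 5, 5, 5, 5);
console.log(euclidean(fdA, fdB));
```

Because every pixel is measured relative to the centroid, the two images produce the same set of radius and theta values, so the distance between the two descriptors should come out as zero up to floating-point error.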

In addition to all this, keep in mind that in order to use the Euclidean distance for descriptor comparison, the length of the descriptor must be the same for every shape. With the FFT, the number of final coefficients depends on the number of contour points of the shape. The solution I have found is to interpolate between points in order to reach a fixed number of points for every shape.
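One possible way to do that interpolation is sketched below: it resamples a closed contour to exactly `n` points spaced evenly along its perimeter, using linear interpolation between neighbouring points. The function name `resampleContour` and the `[x, y]` point representation are assumptions made for this example; other interpolation schemes would work as well.

```javascript
// Sketch only: resample a closed contour to n points, evenly spaced by arc length.
// points is an array of [x, y] pairs; resampleContour is an illustrative name.
function resampleContour(points, n) {
  var m = points.length;

  // Cumulative arc length, including the segment that closes the contour.
  var cum = [0];
  for (var i = 0; i < m; i++) {
    var p = points[i], q = points[(i + 1) % m];
    cum.push(cum[i] + Math.hypot(q[0] - p[0], q[1] - p[1]));
  }
  var total = cum[m];

  var out = [], seg = 0;
  for (var k = 0; k < n; k++) {
    // Position of the k-th sample along the perimeter.
    var target = (k * total) / n;
    // Advance to the segment that contains the target position.
    while (seg < m - 1 && cum[seg + 1] <= target) { seg++; }
    var segLen = cum[seg + 1] - cum[seg];
    var t = segLen > 0 ? (target - cum[seg]) / segLen : 0;
    var a = points[seg], b = points[(seg + 1) % m];
    // Linear interpolation between the two end points of the segment.
    out.push([a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])]);
  }
  return out;
}

// A 4-point square resampled to 8 points: corners plus edge midpoints.
var square = [[0, 0], [4, 0], [4, 4], [0, 4]];
var resampled = resampleContour(square, 8);
console.log(resampled);
```

Resampling every contour to the same `n` before taking the FFT gives descriptors of equal length, which is what the Euclidean distance requires.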

Hope I helped, Lazaros

Hi, I am also working on Fourier descriptor based shape matching. Lazaros' input was really useful. Previously I was using differential contour coding with respect to each coordinate, converting the result to complex coordinates x + iy, and taking the DFT. When I compare against another image of the same object, I get a very large Euclidean distance. I apply Gaussian blur, adaptive thresholding and dilation before finding the contour. Any idea why I get such a large Euclidean distance for the same shape? http://answers.opencv.org/question/57074/shape-matching-using-fourier-descriptor-frequency-domain/ — Spark, Mar 11 '15 at 09:27
