Discard outlier SIFT Key Points in Cell Image with OpenCV

Question

I'm working on a bioinformatics task and need to extract features from some cell images.

I used the SIFT algorithm to extract key points from the image, as you can see in the picture.

[image: cell image with the detected SIFT key points; outliers circled in red]

As you can also see in the picture (circled in red), some key points are outliers and I don't want to calculate any feature on them.

I obtained the cv::KeyPoint vector with the following code:

const cv::Mat input = cv::imread("/tmp/image.jpg", 0); //Load as grayscale

cv::SiftFeatureDetector detector;
std::vector<cv::KeyPoint> keypoints;
detector.detect(input, keypoints);

but I would like to discard from the vector all those key points that have, say, fewer than 3 other key points inside a certain region of interest (ROI) centred on them in the image.

Therefore I need to implement a function returning the number of key points inside of a certain ROI given as input:

int function_returning_number_of_key_points_in_ROI( cv::KeyPoint, ROI );
   //I have not specified ROI on purpose...check question 3

I have three questions:

1. Is there any existing function doing something similar?
2. If not, can you give me some help in understanding how to implement it myself?
3. Would you use a circular or a rectangular ROI for this task? And how would you specify it as input?

Note:

I forgot to specify that I would like an efficient implementation of the function, i.e. checking, for each key point, the relative position of all the others with respect to it would not be a good solution (if there is another way of doing it).

Can you post the original image? I'd like to try out something, and then post back the results if it is successful :) — mevatron, Jun 07 '12 at 14:06
@mevatron - http://s18.postimage.org/jayhj4q3d/phase1_image1.jpg here you go, I uploaded the RGB version, just convert it to grayscale if you whish....let me know what you're doing ;) — Matteo, Jun 07 '12 at 14:38
You can use RANSAC, if you can define a model. RANSAC will decide which points are inliers (fit the model) and outliers (doesn't fit the model). Maybe your model can be something like an 3 points defining an area smaller than X (it means they are close enough). It's an idea. — Jav_Rock, Jun 08 '12 at 10:59
@mevatron - Perfect!I'll wait for news, let me know in any case!and thks ;D — Matteo, Jun 08 '12 at 13:42

score 8 · Accepted Answer · answered Jun 08 '12 at 14:17

I decided to go with the statistical route, but this may not work if you have multiple cells in view.

My solution is fairly straightforward:

1. Compute the keypoint locations
2. Find the centroid of the keypoint spatial locations
3. Compute the Euclidean distance of all points to the centroid
4. Keep only the original keypoints with distance < mu + 2*sigma, where mu and sigma are the mean and standard deviation of those distances

Here is the image that I get using this algorithm (keypoints == green, centroid == red):

[image: filtered keypoints drawn in green, centroid in red]

Finally, here is the code example of how I did it:

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/features2d/features2d.hpp>

#include <iostream>
#include <vector>

using namespace cv;
using namespace std;

void distanceFromCentroid(const vector<Point2f>& points, Point2f centroid, vector<double>& distances)
{
    vector<Point2f>::const_iterator point;
    for(point = points.begin(); point != points.end(); ++point)
    {
        double distance = std::sqrt((point->x - centroid.x)*(point->x - centroid.x) + (point->y - centroid.y)*(point->y - centroid.y));
        distances.push_back(distance);
    }
}

int main(int argc, char* argv[])
{
    Mat input = imread("cell.jpg", 0); //Load as grayscale

    SiftFeatureDetector detector;
    vector<cv::KeyPoint> keypoints;
    detector.detect(input, keypoints);

    vector<Point2f> points;
    vector<KeyPoint>::iterator keypoint;
    for(keypoint = keypoints.begin(); keypoint != keypoints.end(); ++keypoint)
    {
        points.push_back(keypoint->pt);
    }

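    // Centroid of the point set = mean of the keypoint coordinates.
    // (moments() on a point vector treats it as a polygon contour, and its
    // binaryImage flag only applies to images, so it is not used here.)
    Scalar c = mean(points);
    Point2f centroid((float)c[0], (float)c[1]);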

    vector<double> distances;
    distanceFromCentroid(points, centroid, distances);

    Scalar mu, sigma;
    meanStdDev(distances, mu, sigma);

    cout << mu.val[0] << ", " << sigma.val[0] << endl;

    vector<KeyPoint> filtered;
    for(size_t i = 0; i < distances.size(); ++i)
    {
        if(distances[i] < (mu.val[0] + 2.0*sigma.val[0]))
        {
            filtered.push_back(keypoints[i]);
        }
    }

    Mat out = input.clone();
    drawKeypoints(input, filtered, out, Scalar(0, 255, 0));

    circle(out, centroid, 7, Scalar(0, 0, 255), 1);

    imshow("kpts", out);
    waitKey();

    imwrite("statFilter.png", out);

    return 0;
}
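
For the per-keypoint neighbour count asked about in the question, here is a minimal sketch under a few assumptions: a circular ROI given by a `radius` in pixels, and a uniform grid with cells of side `radius`, so each point is only compared against the points in the 3x3 block of cells around it rather than against every other point. `Pt`, `countNeighbours` and `denseIndices` are hypothetical names, not OpenCV API; `Pt` stands in for `cv::KeyPoint::pt` so the sketch has no OpenCV dependency.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for cv::Point2f (i.e. cv::KeyPoint::pt).
struct Pt { float x, y; };

// For every point, count the OTHER points lying within `radius` of it.
// Assumes radius > 0.
std::vector<int> countNeighbours(const std::vector<Pt>& pts, float radius)
{
    typedef std::pair<long, long> Cell;
    std::map<Cell, std::vector<std::size_t> > grid;

    // Bucket every point index by the grid cell (side = radius) it falls into.
    for (std::size_t i = 0; i < pts.size(); ++i) {
        Cell c((long)std::floor(pts[i].x / radius),
               (long)std::floor(pts[i].y / radius));
        grid[c].push_back(i);
    }

    std::vector<int> counts(pts.size(), 0);
    const float r2 = radius * radius;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const long cx = (long)std::floor(pts[i].x / radius);
        const long cy = (long)std::floor(pts[i].y / radius);
        // Any point within `radius` must sit in one of the 9 surrounding cells.
        for (long dx = -1; dx <= 1; ++dx) {
            for (long dy = -1; dy <= 1; ++dy) {
                std::map<Cell, std::vector<std::size_t> >::const_iterator it =
                    grid.find(Cell(cx + dx, cy + dy));
                if (it == grid.end()) continue;
                for (std::size_t k = 0; k < it->second.size(); ++k) {
                    const std::size_t j = it->second[k];
                    if (j == i) continue;  // do not count the point itself
                    const float ex = pts[j].x - pts[i].x;
                    const float ey = pts[j].y - pts[i].y;
                    if (ex * ex + ey * ey <= r2) ++counts[i];
                }
            }
        }
    }
    return counts;
}

// Indices of the points that have at least `minNeighbours` neighbours in range.
std::vector<std::size_t> denseIndices(const std::vector<Pt>& pts,
                                      float radius, int minNeighbours)
{
    const std::vector<int> counts = countNeighbours(pts, radius);
    std::vector<std::size_t> keep;
    for (std::size_t i = 0; i < counts.size(); ++i)
        if (counts[i] >= minNeighbours) keep.push_back(i);
    return keep;
}
```

With OpenCV keypoints, one would copy each `keypoint->pt` into a `Pt`, call `denseIndices(pts, radius, 3)`, and keep `keypoints[i]` for the returned indices. Unlike the centroid test, this one does not assume a single cell in view, but both `radius` and the neighbour threshold have to be tuned to the images.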

Hope that helps!

Actually the solution you proposed is really neat and straightforward!! However, as you noticed, this may have problems when more than a cell is contained in image. In my dataset there are some bad images, but I'm trying to clean it by discarding those samples. I'll stick to this solution for now, and in case ask for further help! ;) THKS SO MUCH... — Matteo, Jun 08 '12 at 14:32
Awesome! Glad you found it useful; cool problem by the way :) I was thinking if you have multiple cells you might be able to do some type of clustering operation (K-Nearest Neighbors or something similar) as a pre-processing step, and process them separately that way. — mevatron, Jun 08 '12 at 14:37
It's a project in Bioinformatics, I need to classify the evolution of cells by analysing their morphology! And this is only the beginning ;) also the k-means idea seems really clever, I'll try it and if you're interested find some way to let you know the evolution of project. — Matteo, Jun 08 '12 at 15:14
Very cool :) Look forward to seeing the progression, and maybe more questions about it! — mevatron, Jun 08 '12 at 15:25
@mevatron: nice solution. But why you don't try ransac? I belive it could be faster, example: http://stackoverflow.com/questions/8855020/opencv-surf-and-outliers-detection — dynamic, Jul 15 '12 at 10:06
@yes123 Thanks! Well, with RANSAC, you must give it a model to use for outlier filtering. Being that I don't have a model defining the cell structure it wouldn't be very effective :-\ RANSAC is also probably quite a bit slower than my linear scan, but potentially more robust with a good model. — mevatron, Apr 25 '13 at 15:47
@mevatron Can point 3 in your suggestion be done with Hamming distance? Computing Euclidean distance is proven to be much less efficient. — rbaleksandar, May 12 '14 at 09:34
