
I have two images: a large image of a puppy (the scene) and a small crop of its nose (the target):

[Image: puppy nose]

I have gathered SURF points for both the target and the scene and then matched them. I have displayed the best matches on the scene image as follows:

[Image: matching SURF points]

What is the best way to identify the largest cluster of matched SURF points?

In this example, all the points are well clustered. However, in some other examples, there are several outliers that I want to exclude.

Thanks.

Update: KDE worked well for me. Thanks everyone, that's great.

Chris Parry
  • How about taking the median of the x and y coordinates as the center of your cluster and including either all points within a certain radius or some percentage of the closest points? – gregswiss Sep 20 '15 at 04:14
  • I think that would work in the puppy example (which was poorly chosen by me). I have provided an update below: a better example, with more typical outliers for my problem set. Thanks! – Chris Parry Sep 20 '15 at 04:22
  • Please update the question, not an answer. Have a look at **density estimation** instead of clustering. – Has QUIT--Anony-Mousse Sep 20 '15 at 17:09

2 Answers


You don't need cluster analysis.

What you want to find is the area of highest density. That is probably where the true match is.

There are many methods of density estimation, particularly for low-dimensional data. Consider kernel density estimation (KDE) if you can afford it.
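As a rough illustration of the KDE idea, here is a minimal sketch in plain Python. It scores each matched point with a Gaussian kernel and keeps the points near the density peak. The sample coordinates, the `bandwidth` value, the helper name `kde_scores`, and the half-of-peak threshold are all assumptions made for this example, not something taken from the question.

```python
import math

def kde_scores(points, bandwidth):
    """Unnormalized Gaussian KDE evaluated at every sample point."""
    def density(p):
        return sum(
            math.exp(-((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
                     / (2 * bandwidth ** 2))
            for q in points
        )
    return [density(p) for p in points]

# Hypothetical (x, y) locations of matched keypoints in the scene image:
# four points near (100, 100) plus two outliers.
matches = [(100, 100), (102, 98), (99, 103), (101, 101), (300, 40), (20, 250)]

scores = kde_scores(matches, bandwidth=10.0)

# The sample with the highest density approximates the location of the match.
peak = matches[max(range(len(matches)), key=scores.__getitem__)]

# Assumed rule of thumb: keep points whose density is at least half the peak.
threshold = 0.5 * max(scores)
inliers = [p for p, s in zip(matches, scores) if s >= threshold]
```

The bandwidth controls how far apart two matches can be and still support each other, so it probably needs tuning against the size of the target. For larger point sets, a library routine such as SciPy's `gaussian_kde` may be a better fit than this quadratic loop.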

If you cannot afford density estimation, but need something very fast, try this:

  1. In each dimension, compute the median.

  2. Combine the medians into a vector and use it as the estimate.

The median is more robust to outliers than the mean, but other than that, it's essentially taking the (robust) mean of all your points. This works well unless there are multiple good matches in the same image; in that case, density estimation as discussed above will be better.
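The fast median estimate could be sketched as follows. The coordinates are made up for illustration, and `median_center` is a hypothetical helper name.

```python
from statistics import median

def median_center(points):
    """Per-dimension median of 2D points, returned as an (x, y) tuple."""
    xs, ys = zip(*points)
    return (median(xs), median(ys))

# Hypothetical matches: four points near (100, 100) and one outlier.
matches = [(100, 100), (102, 98), (99, 103), (101, 101), (300, 40)]
center = median_center(matches)
```

With these points the single outlier at (300, 40) barely moves the estimate, whereas the mean of the x coordinates would be pulled far to the right.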

Has QUIT--Anony-Mousse

What you are basically looking for is the mode of the density function induced by the matched points. That is, in the 2D space of the image, each matched point is a "sample" from a probability density describing where the "target" matches the image. You are looking for the point in 2D where this density function has a "peak": that is the center where most matches are.

There is a well-known algorithm for finding the modes of a density function from samples of it: mean shift. Applying mean shift to the XY coordinates of the matches in the image (I would use a circular or triangular kernel with a size proportional to the size of the "target") should yield the coordinates of the center of the "target" in the image.
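A minimal sketch of mean shift with a circular (flat) kernel, in plain Python, might look like this. The function names, the sample coordinates, and the `radius` value are assumptions for illustration. Running the procedure from every match and keeping the mode with the most support is one simple way to pick the dominant peak, not necessarily what a full implementation would do.

```python
import math

def mean_shift_mode(points, start, radius, max_iter=100, tol=1e-3):
    """Move `start` to the mean of the points within `radius` until it settles."""
    cx, cy = start
    for _ in range(max_iter):
        window = [(x, y) for x, y in points
                  if math.hypot(x - cx, y - cy) <= radius]
        if not window:
            break
        nx = sum(x for x, _ in window) / len(window)
        ny = sum(y for _, y in window) / len(window)
        moved = math.hypot(nx - cx, ny - cy)
        cx, cy = nx, ny
        if moved < tol:
            break
    return (cx, cy)

def best_mode(points, radius):
    """Run mean shift from every point; return the mode with the most support."""
    best, best_support = None, -1
    for p in points:
        mode = mean_shift_mode(points, p, radius)
        support = sum(1 for x, y in points
                      if math.hypot(x - mode[0], y - mode[1]) <= radius)
        if support > best_support:
            best, best_support = mode, support
    return best, best_support

# Hypothetical matches: four points near (100, 100) and two outliers.
matches = [(100, 100), (102, 98), (99, 103), (101, 101), (300, 40), (20, 250)]
mode, support = best_mode(matches, radius=20.0)
```

Here `radius` plays the role of the kernel size, so it would normally be chosen in proportion to the size of the target.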

A quick Google search suggests this implementation of mean shift clustering.

Note: please do not get confused with Comaniciu and Meer's mean shift segmentation.

Shai