I need to match a smaller set of images against a larger one and, beyond that, rank the matches by similarity.

The first thing that came to mind was to use SIFT features, and I found that the vl_sift function does the job really well. With the same library I was able to use vl_ubcmatch to get the matching keypoints between two images. My problem now is finding a criterion to rank the similarity between images and a good strategy for applying these methods to the whole database.

Can you help me out?

Note 1: The images come from an onboard camera in a vehicle that made several trips around town, capturing one image per second. By "similarity" I mean that images of the same location should receive a high similarity rank. If I define some known locations around town as A, B, C and D, I want the algorithm to find which images, in the whole set of pictures taken, were taken at those locations.

Note 2: I'm using MATLAB.

nVolteX
  • You could look into [deep belief networks](http://www.cs.toronto.edu/~hinton/nipstutorial/nipstut3.pdf). They do a good job of whittling the data down into principal features so that high-dimensional data can be described in just a handful of dimensions. – nispio Oct 19 '13 at 19:34
  • Can you upload some example images? – GilLevi Oct 19 '13 at 22:35
  • It depends mainly on how YOU define the similarity you want to capture. Which features do you want to compare: color, contrast, smoothing, ...? – Adiel Oct 20 '13 at 06:15
  • The images, as I said, are taken from a camera inside a vehicle doing some trips around town, so they are pictures of various locations throughout the town. Similar images are images of the same location (even from a slightly different point of view). Imagine that the car passes a well-known statue in town twice: I want the pictures taken around the statue to be assigned a "high degree of similarity". I don't know if I'm explaining myself properly, so if you need more info to understand it better, just say so ;) – nVolteX Oct 20 '13 at 09:45
  • You can try to match keypoints between the images and then solve for a homography (or affine transformation) between them. The similarity can be the percentage of keypoints that correspond, or whether or not the matches agree on a homography/affine transformation. Please tell me if this makes sense to you or if you have any questions about the idea. – GilLevi Oct 20 '13 at 16:14
  • I wasn't able to understand the part about solving a homography between the images to verify their similarity (it's my first time working with computer vision), and the other solution I had already thought about. But suppose I have two series of images of sizes m and n (m > n) and I want to match the sequence of size n to images in the one of size m: wouldn't I have to make n*m comparisons? Is there a more efficient way? My other question about using the percentage of corresponding keypoints is how good a measure it really is. By the way, thanks for everything ;) – nVolteX Oct 20 '13 at 22:28
  • Using homographies will not work for images of 3D scenes taken from different viewpoints. If you are taking pictures of large planes, e.g., walls or the ground, then the homography approach would work (inside a RANSAC loop, of course). I think you should take a look at structure-from-motion approaches for internet photo collections, such as 'Towards Linear-time Incremental Structure from Motion' by C. Wu, 'Photo tourism: Exploring photo collections in 3D' by N. Snavely et al., and 'Visual Modeling with a Hand-Held Camera' by M. Pollefeys et al. – who9vy Oct 21 '13 at 19:52
  • Well, I have used vl_sift to compute the keypoints of each image and vl_ubcmatch to compare images. I thought that "number of matches" / "total number of keypoints" in an image would be a good measure of similarity, but I rarely get over 65% matching points even when the images are fairly similar. On the other hand, my ranking needs to be between 0 (not similar at all) and 1 (very, very similar). Does anyone have an idea how to go about this? – nVolteX Oct 22 '13 at 16:46
  • Have you considered [geometric min-hashing](https://dspace.cvut.cz/bitstream/handle/10467/9547/2009-Geometric-min-hashing-Finding-thick-needle-in-haystack.pdf?sequence=1)? – Shai Nov 05 '13 at 15:14
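
The match-ratio idea discussed in the comments can be turned into a bounded score. Below is a minimal sketch in Python (the arithmetic carries over directly to MATLAB), assuming the number of matches and the keypoint counts have already been obtained, for example from vl_ubcmatch and vl_sift. Normalising by the smaller keypoint set and clamping to 1 are illustrative choices, not something the thread prescribes; `match_score`, `rank_by_score` and the sample counts are hypothetical.

```python
def match_score(n_matches, n_keypoints_a, n_keypoints_b):
    """Similarity in [0, 1]: matched fraction of the smaller keypoint set."""
    smaller = min(n_keypoints_a, n_keypoints_b)
    if smaller == 0:
        return 0.0  # no keypoints, nothing can match
    # Clamp, since several keypoints of one image may match the same
    # keypoint of the other, pushing the raw ratio above 1.
    return min(1.0, n_matches / smaller)


def rank_by_score(n_query_keypoints, candidates):
    """candidates: list of (name, n_matches, n_keypoints); best match first."""
    scored = [(name, match_score(m, n_query_keypoints, k))
              for name, m, k in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up counts for one query image with 120 keypoints.
candidates = [("location_B", 12, 150), ("location_A", 45, 90), ("location_C", 0, 80)]
ranking = rank_by_score(120, candidates)
print(ranking)
```

For ranking, only the order of the scores matters, so a best score well below 1 is not in itself a problem.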

1 Answer

You can use the "bag-of-words" approach. It is well described in Sivic and Zisserman's paper "Video Google: Efficient Visual Search of Videos".
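
To make the idea concrete, here is a minimal bag-of-words sketch in plain Python (it translates readily to MATLAB). It assumes a visual vocabulary already exists, for instance from k-means clustering of SIFT descriptors, and uses toy 2-D "descriptors" in place of real 128-D SIFT vectors. All function names, image names and data are hypothetical, and the tf-idf weighting with cosine similarity is one common choice rather than a full reproduction of the paper's method.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary centre closest to the descriptor (squared L2)."""
    return min(range(len(vocabulary)),
               key=lambda k: sum((d - c) ** 2
                                 for d, c in zip(descriptor, vocabulary[k])))


def word_counts(descriptors, vocabulary):
    """Histogram of visual words for one image."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)


def idf_weights(all_counts, n_words):
    """Inverse document frequency: words seen in few images weigh more."""
    n_images = len(all_counts)
    weights = []
    for k in range(n_words):
        df = sum(1 for counts in all_counts if counts[k] > 0)
        weights.append(math.log(n_images / df) if df else 0.0)
    return weights


def tfidf_vector(counts, idf):
    """L2-normalised tf-idf vector of one image."""
    total = sum(counts.values())
    vec = [counts[k] / total * idf[k] if total else 0.0
           for k in range(len(idf))]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def rank_database(query_descriptors, database, vocabulary):
    """Return [(image name, cosine similarity)] sorted best first."""
    db_counts = {name: word_counts(d, vocabulary) for name, d in database.items()}
    idf = idf_weights(list(db_counts.values()), len(vocabulary))
    q = tfidf_vector(word_counts(query_descriptors, vocabulary), idf)
    scores = [(name, sum(a * b for a, b in zip(q, tfidf_vector(c, idf))))
              for name, c in db_counts.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy data: three visual words and three database images.
vocabulary = [(0, 0), (10, 0), (0, 10)]
database = {
    "statue_pass_1": [(0.1, 0), (9.8, 0.2), (10.1, 0)],
    "crossroads": [(0, 9.9), (0.2, 10.1), (0, 0.1)],
    "bridge": [(0, 10), (0.1, 9.7)],
}
query = [(10, 0.1), (9.9, 0), (0, 0.2)]  # e.g. a second pass by the statue
ranked = rank_database(query, database, vocabulary)
print(ranked)
```

Because the vectors are non-negative and normalised, the cosine score lies between 0 and 1, which fits the ranking range asked for in the comments. Each image is reduced to one fixed-length vector, so comparing two images becomes a dot product instead of a full descriptor-to-descriptor match.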

old-ufo