
I'm writing a program with OpenCV that is supposed to detect objects in a scene, namely products in a supermarket.

I plan to use SURF descriptors for this purpose. However, everything I've found so far deals with finding a single occurrence of an object in a scene (generally with nearest-neighbor matching), and I've found nothing about detecting objects in a scene that contains multiple instances of the same object. Nearest-neighbor matching obviously doesn't work there, since the best match for each point may lie in a different instance.

I also need to use a classifier, such as an artificial neural network (ANN), which could be more helpful in finding multiple instances of the object. However, I don't understand how to use an ANN (or any other classifier) with keypoints.

Should I use the 64 (?) values of each SURF point as the input of the ANN, and each of, say, 5 products as the output? That would mean that all the points within one object, which are not similar to each other, would have to produce the same output.

I've read that that's the way to go, but I don't see how it could work since all the keypoints in one object may (and should) have different characteristics. But I can't think of any other way to do it.

Sorry if I haven't explained it very well, I'll try to clarify if something's not clear enough.

Gerardo Galarza
  • Since finding several instances of the same object apparently isn't possible, let me ask a simpler question. When using SURF, what should I use as training data for an ANN or an SVM? Would each keypoint be one training sample, with the label of the image containing that point as the output? – Gerardo Galarza Jun 28 '13 at 20:34

1 Answer


I had a similar problem. What I have done is the following:

  • Use a sliding window. Sweep the whole image with ROIs of various sizes. The size of the ROI should be roughly the size of the expected object.
  • For each patch, detect the features and do the matching. If an object is detected, set that region to zero in the main image.
  • Go to the next patch and repeat.
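The steps above could be sketched roughly as follows. This is only an illustration, not the code I actually ran: `sliding_windows`, `find_instances`, and the `detect` callback are hypothetical names, and `detect` stands in for whatever feature detection and matching you use on a patch. The default step of half the window size is the value mentioned in the comments below.

```python
import numpy as np

def sliding_windows(img_w, img_h, win_w, win_h, step_frac=0.5):
    """Yield (x, y, w, h) windows; step_frac=0.5 makes neighbours overlap by half."""
    step_x = max(1, int(win_w * step_frac))
    step_y = max(1, int(win_h * step_frac))
    for y in range(0, max(1, img_h - win_h + 1), step_y):
        for x in range(0, max(1, img_w - win_w + 1), step_x):
            yield x, y, win_w, win_h

def find_instances(image, window_sizes, detect):
    """Sweep windows of several sizes and blank each detected region.

    `detect(patch) -> bool` is a hypothetical callback standing in for the
    feature detection + matching step.
    """
    work = image.copy()  # keep the caller's image untouched
    found = []
    for win_w, win_h in window_sizes:
        for x, y, w, h in sliding_windows(work.shape[1], work.shape[0], win_w, win_h):
            patch = work[y:y + h, x:x + w]
            if detect(patch):
                found.append((x, y, w, h))
                # Zero the region so the same instance is not matched again.
                work[y:y + h, x:x + w] = 0
    return found
```

Zeroing is done on a copy, so overlapping windows visited later see the blanked region while the original image stays intact.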

But it can be a bit slow with SURF (if you have a lot of ROIs to sweep), so I used the FAST feature detector and the BRISK descriptor to speed up the process. It worked well.

cyberdecker
  • Thanks a lot for the answer. I think I'll go with the sliding window, although I was trying to avoid it because I can't be sure of the object size. I'm not sure I understand the second point: do you mean setting the pixels of the matching patch to zero in the image? How did you do the matching? Nearest neighbor? Do you know a way to match them through support vector machines, neural networks or other classifiers? – Gerardo Galarza Jun 28 '13 at 20:58
  • For the sliding window, try with various sizes. For the matching, I used a RobustMatcher class from the book OpenCV2 Computer Vision Application Programming Cookbook. It does a robust matching. You can get the code here: https://code.google.com/p/opencv-cookbook/ (chapter 09 > matcher.h) – cyberdecker Jun 28 '13 at 21:23
  • I don't know how to do it with an NN or SVM, but I think that with the matcher I mentioned and some filtering (for example, checking whether the homography matrix makes sense, whether warping the perspective gives a reasonable rectangle with the expected width-to-height ratio, etc.) you can get nice detections. – cyberdecker Jun 28 '13 at 21:26
  • OK, I will try to do it that way. Finally, did you move the window a few pixels every iteration, or in steps equal to its width? – Gerardo Galarza Jun 28 '13 at 21:44
  • I moved it in steps equal to half of the width. In my case, when I moved by the full width, the object would sometimes be cut in half (half of the object in one window and the other half in the next). So using step = width/2 improves the chance of capturing a whole object. But this can vary; you can tweak the step value to find what works best in your situation. – cyberdecker Jun 28 '13 at 21:53
  • Can you provide code for detecting multiple instances of an object in an image? –  Ritik Kumar Agrahari May 05 '17 at 07:30
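The homography filtering suggested in the comments above (checking that the warped template is a reasonable rectangle with the expected width-to-height ratio) might look something like this. It is a sketch under assumptions: `projected_corners`, `is_plausible`, and the 30% tolerance are illustrative choices, and `H` is assumed to be a 3x3 homography mapping template coordinates to scene coordinates, for example one estimated with `cv2.findHomography`.

```python
import numpy as np

def projected_corners(H, w, h):
    """Map the corners of a w x h template through H; None if degenerate."""
    corners = np.array([[0, 0, 1], [w, 0, 1], [w, h, 1], [0, h, 1]], dtype=float).T
    p = np.asarray(H, dtype=float) @ corners
    if np.any(np.abs(p[2]) < 1e-12):
        return None
    return (p[:2] / p[2]).T  # shape (4, 2)

def is_plausible(H, w, h, ratio_tol=0.3):
    """Accept H only if the warped template is a convex quad with a sane aspect ratio."""
    quad = projected_corners(H, w, h)
    if quad is None:
        return False
    edges = np.roll(quad, -1, axis=0) - quad
    nxt = np.roll(edges, -1, axis=0)
    # Cross products of consecutive edges share one sign for a convex quad.
    cross = edges[:, 0] * nxt[:, 1] - edges[:, 1] * nxt[:, 0]
    if not (np.all(cross > 0) or np.all(cross < 0)):
        return False
    lengths = np.linalg.norm(edges, axis=1)
    width = (lengths[0] + lengths[2]) / 2    # mean of top and bottom sides
    height = (lengths[1] + lengths[3]) / 2   # mean of right and left sides
    expected = w / h
    return abs(width / height - expected) <= ratio_tol * expected
```

For instance, `is_plausible(np.eye(3), 100, 50)` accepts the identity mapping, while a homography that stretches the template to five times its height is rejected.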