
I have recently implemented recognition software following the technique described in this paper. However, my dataset also contains depth maps captured with OpenNI.

I'd like to increase the robustness of the recognizer using the depth information. I thought about training one-vs-all SVMs on bag-of-words response histograms computed from VFH descriptors (I adapted OpenCV's DescriptorExtractor interface for this task). But the question is: how can I combine the two to get more precise results? Can someone suggest a strategy for this?
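One possible strategy (a late-fusion sketch, not something prescribed by the paper) is to train the 2D bag-of-words SVM and the depth/VFH bag-of-words SVM separately and combine their decision values with a weighted sum. In the OpenCV 2.x C++ API this could look roughly like the following; `svm2d`, `svmDepth`, `hist2d`, `histDepth` and the weight `alpha` are all hypothetical names:

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/ml/ml.hpp>

// Late fusion of two independently trained classifiers: one SVM on the
// 2D bag-of-words histogram, one on the VFH bag-of-words histogram.
// alpha weights how much to trust the 2D cue versus the depth cue.
float fusedScore(const CvSVM& svm2d, const CvSVM& svmDepth,
                 const cv::Mat& hist2d, const cv::Mat& histDepth,
                 float alpha = 0.5f)
{
    // returnDFVal = true makes predict() return the signed distance to
    // the separating hyperplane instead of the hard class label
    float score2d    = svm2d.predict(hist2d, true);
    float scoreDepth = svmDepth.predict(histDepth, true);
    return alpha * score2d + (1.0f - alpha) * scoreDepth;
}
```

An alternative would be early fusion: concatenate the two histograms (e.g. with cv::hconcat) and train a single SVM on the joint vector. Which of the two works better usually has to be determined empirically on a validation set.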

P.S. I would very much like to test the recognizer by showing objects directly to a Kinect (rather than, as I'm doing right now, feeding cropped images to the recognizer).
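For live testing, PCL's OpenNI grabber can stream point clouds straight from the Kinect. A minimal sketch, assuming PCL 1.x built with OpenNI support and with the recognition step left as a stub:

```cpp
#include <pcl/io/openni_grabber.h>
#include <pcl/point_types.h>
#include <boost/bind.hpp>
#include <boost/thread/thread.hpp>

// Minimal live-capture loop: every frame from the Kinect arrives in the
// callback, which is where cropping/segmentation and recognition would go.
class LiveRecognizer {
public:
    void cloudCallback(const pcl::PointCloud<pcl::PointXYZRGBA>::ConstPtr& cloud)
    {
        // TODO: segment the object out of `cloud` and run the recognizer on it
    }

    void run()
    {
        pcl::OpenNIGrabber grabber;
        boost::function<void (const pcl::PointCloud<pcl::PointXYZRGBA>::ConstPtr&)> f =
            boost::bind(&LiveRecognizer::cloudCallback, this, _1);
        grabber.registerCallback(f);
        grabber.start();
        while (true)  // runs until killed; a real app would add a stop condition
            boost::this_thread::sleep(boost::posix_time::seconds(1));
    }
};
```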

Jed Fox
  • Are you sure using depth information would improve robustness? The paper you cite uses SIFT/bag of visual words as a descriptor, which results in an affine-invariant system, e.g. you can scale/rotate/translate the object and it will still give broadly similar descriptors and so recognise the object. If you use depth information and start tilting the object at various angles to the camera, you will get quite different signals. – jcollomosse May 11 '14 at 18:32

1 Answer


I suggest you have a look at PCL, which is a framework much like OpenCV, only dedicated to point-cloud processing. It has been a while since I used it, but its algorithms are state-of-the-art implementations.
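For example, computing the global VFH descriptor of a segmented cluster in PCL looks roughly like this (a sketch following PCL's VFH estimation API; "object.pcd" and the 3 cm normal-estimation radius are placeholder choices):

```cpp
#include <pcl/point_types.h>
#include <pcl/io/pcd_io.h>
#include <pcl/features/normal_3d.h>
#include <pcl/features/vfh.h>
#include <pcl/search/kdtree.h>

int main()
{
    // Load a segmented object cluster (placeholder file name)
    pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::io::loadPCDFile("object.pcd", *cloud);

    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);

    // VFH needs per-point surface normals
    pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
    ne.setInputCloud(cloud);
    ne.setSearchMethod(tree);
    ne.setRadiusSearch(0.03);  // 3 cm neighbourhood; tune for your data
    pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
    ne.compute(*normals);

    // One global 308-bin VFH histogram for the whole cluster
    pcl::VFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::VFHSignature308> vfh;
    vfh.setInputCloud(cloud);
    vfh.setInputNormals(normals);
    vfh.setSearchMethod(tree);
    pcl::PointCloud<pcl::VFHSignature308>::Ptr descriptor(
        new pcl::PointCloud<pcl::VFHSignature308>);
    vfh.compute(*descriptor);  // descriptor->points[0].histogram holds the 308 bins

    return 0;
}
```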

Rasmus