I'm trying to set up an object classification system with OpenCV. When I detect a new object in a scene, I want to know whether it belongs to a known object class (is it a box, a bottle, something unknown, etc.).
My steps so far:
- Cropping the image to the ROI where a new object could appear
- Calculating keypoints for every image (cv::SurfFeatureDetector)
- Calculating descriptors for each keypoint (cv::SurfDescriptorExtractor)
- Generating a vocabulary using Bag of Words (cv::BOWKMeansTrainer)
- Calculating response histograms (cv::BOWImgDescriptorExtractor)
- Using the response histograms to train a cv::SVM for every object class
- Using the same set of images again to test the classification
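To make the response-histogram step above concrete, here is a minimal sketch in plain C++ without OpenCV. It assumes each descriptor is assigned to its nearest vocabulary word by Euclidean distance and that the histogram is normalised by the number of descriptors. The names `Descriptor`, `nearestWord` and `responseHistogram` are hypothetical and only illustrate the idea; this is not the implementation behind cv::BOWImgDescriptorExtractor.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one row of a descriptor matrix.
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Index of the vocabulary word (cluster centre) closest to the descriptor.
std::size_t nearestWord(const Descriptor& d,
                        const std::vector<Descriptor>& vocabulary) {
    assert(!vocabulary.empty());
    std::size_t best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t w = 0; w < vocabulary.size(); ++w) {
        const float dist = squaredDistance(d, vocabulary[w]);
        if (dist < bestDist) {
            bestDist = dist;
            best = w;
        }
    }
    return best;
}

// Normalised histogram of word occurrences for one image.
// Assumption: bins are divided by the descriptor count so they sum to 1.
std::vector<float> responseHistogram(const std::vector<Descriptor>& descriptors,
                                     const std::vector<Descriptor>& vocabulary) {
    std::vector<float> hist(vocabulary.size(), 0.0f);
    if (descriptors.empty()) {
        return hist;  // no keypoints found: all bins stay at zero
    }
    for (const Descriptor& d : descriptors) {
        hist[nearestWord(d, vocabulary)] += 1.0f;
    }
    for (float& bin : hist) {
        bin /= static_cast<float>(descriptors.size());
    }
    return hist;
}
```

In this sketch, whichever image region the descriptors come from (the whole ROI or just the object) directly determines which bins get filled, so the same choice presumably has to be made for training and for testing.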
I know that there is still something wrong with my code, since the classification doesn't work yet.
But I don't really know when I should use the full image (cropped to the ROI) and when I should extract the new object from the image and use just the object itself.
It's my first step into object recognition/classification, and I have seen people using both full images and extracted objects, but I just don't know when to use which.
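One way to picture the difference between the two options is as a filter on keypoints: using the full ROI keeps every keypoint, while using only the extracted object keeps just those inside the object's bounding box. The sketch below is plain C++ with hypothetical `Point` and `Box` types (in OpenCV, cv::KeyPoint and cv::Rect play these roles), and it assumes the object's bounding box is already known from some earlier detection step.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal types, used here only for illustration.
struct Point { float x; float y; };
struct Box { float x; float y; float width; float height; };

// True if the point lies inside the box (left/top inclusive, right/bottom exclusive).
bool contains(const Box& box, const Point& p) {
    assert(box.width >= 0.0f && box.height >= 0.0f);
    return p.x >= box.x && p.x < box.x + box.width &&
           p.y >= box.y && p.y < box.y + box.height;
}

// Keep only the keypoints that fall inside the object's bounding box.
std::vector<Point> keypointsInside(const std::vector<Point>& keypoints,
                                   const Box& objectBox) {
    std::vector<Point> kept;
    for (const Point& p : keypoints) {
        if (contains(objectBox, p)) {
            kept.push_back(p);
        }
    }
    return kept;
}
```

With the full ROI, background keypoints also end up in the descriptors; with the filtered set, only keypoints on the object do.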
I hope someone can clarify this for me.