I'm working on a project for my university robotics class. The main task is to grasp a cup with a KUKA youBot.
The main problem I've run into is the computer vision part of the project. More precisely, I'm using a Kinect (which will be mounted on the robot arm), and initially I planned to find the cup with this procedure:
- Take a picture of the cup (very close to it) before starting.
- With OpenCV, compute the keypoints of that reference image and the keypoints of the live image taken in real time by the Kinect mounted on the robot arm.
- Match the keypoints between the two images with OpenCV.
- Map the matched 2D points of the live image to the 3D point cloud captured by the Kinect at the same moment, compute the centroid of those 3D points, and thus obtain the position of the cup (a rough sketch of this pipeline is shown below).
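To make the idea concrete, here is a minimal Python/OpenCV sketch of what I mean. It is not my actual code: the file names, the way frames are grabbed, the choice of ORB features and the Kinect intrinsics are all placeholder assumptions (use your own calibration and driver).

```python
import numpy as np
import cv2

# Reference image of the cup, taken up close before the run
ref = cv2.imread("cup_reference.png", cv2.IMREAD_GRAYSCALE)

# Live RGB frame and depth image from the Kinect (depth registered to RGB).
# How these are grabbed depends on the driver; here they are placeholders.
live = cv2.imread("live_frame.png", cv2.IMREAD_GRAYSCALE)
depth = np.load("live_depth.npy")   # depth in metres, same size as the live frame

# 1) Keypoints + descriptors on both images (ORB here, as an example detector)
orb = cv2.ORB_create(nfeatures=1000)
kp_ref, des_ref = orb.detectAndCompute(ref, None)
kp_live, des_live = orb.detectAndCompute(live, None)

# 2) Match descriptors and keep the best matches
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_live), key=lambda m: m.distance)
good = matches[:30]

# 3) Back-project the matched live pixels to 3D using the depth image.
#    fx, fy, cx, cy are assumed default Kinect intrinsics; use your calibration.
fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5
points_3d = []
for m in good:
    u, v = kp_live[m.trainIdx].pt
    z = depth[int(v), int(u)]
    if z > 0:                        # skip invalid depth readings
        points_3d.append([(u - cx) * z / fx, (v - cy) * z / fy, z])

# 4) Centroid of the matched 3D points = rough position of the cup in the camera frame
if points_3d:
    centroid = np.mean(points_3d, axis=0)
    print("Estimated cup position (camera frame):", centroid)
```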
The problem is that when the live image is taken from close up (about 1 meter), the keypoint extraction and the matching against the reference image work well; from farther away, especially when there are other objects in the scene, OpenCV detects strong features elsewhere and the matching fails.
I'm following this tutorial: http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html
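For reference, the part of that tutorial I rely on looks roughly like the following Python sketch. It is only an illustration: I swap in ORB and a brute-force matcher instead of the tutorial's SURF and FLANN, and it reuses the ref, live, kp_ref, kp_live and good variables from the sketch above.

```python
import numpy as np
import cv2

# Estimate a homography between the reference image and the live frame,
# then project the reference image corners into the live frame to
# localise the cup in 2D (this is what the linked tutorial does).
src_pts = np.float32([kp_ref[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst_pts = np.float32([kp_live[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

H, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

if H is not None:
    h, w = ref.shape
    corners = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
    projected = cv2.perspectiveTransform(corners, H)

    # Outline where the reference view of the cup is found in the live frame
    live_colour = cv2.cvtColor(live, cv2.COLOR_GRAY2BGR)
    cv2.polylines(live_colour, [np.int32(projected)], True, (0, 255, 0), 2)
```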
You can see this in these two pictures: in the first the robot is close to the cup and everything works, http://postimg.org/image/byx8danpt/ ; in the second we are far from the cup and nothing works, http://postimg.org/image/9lhhzxk4z/ .
I would like to know whether there are other methods better suited to my project, perhaps something model-based rather than feature-based like my initial idea.
Thanks, Luca