I'm trying to implement object recognition with OpenCV. So far I have only used recorded data without any clutter in the scene, which let me focus on the recognition part. Now I've realized that this usually doesn't work in a real scenario. I want my algorithm to work for a scene where I place a (single) object on the floor or on a table in front of the camera, and then create a mask (cv::Mat) that lets me blank out everything but the object on the table/floor.
So I've started looking at the Point Cloud Library, but I'm not sure what the best approach is. My idea is to run a plane segmentation algorithm on the point cloud from a Kinect camera, remove the plane, and keep only the cloud of the object. After that I plan to read the x and y coordinates of the outermost points to create a mask with OpenCV. Would this work? Are there better approaches, or perhaps a source/example of a working solution?
Thanks for helping!
... The problem is that after the segmentation the point cloud is no longer organized, and I did not manage to read out the (expected) x, y coordinates of the points in the cloud using:
for (size_t i = 0; i < cloud->points.size(); ++i)
{
    std::cout << "PointCloud x: " << cloud->points[i].x << std::endl;
    std::cout << "PointCloud y: " << cloud->points[i].y << std::endl;
    std::cout << "PointCloud z: " << cloud->points[i].z << std::endl;
}
This gives me floating-point numbers which I am not sure how to interpret. I apologize for the basic question, but what do these numbers mean, and how can I get integer (pixel) coordinates from them?
Thanks