Create mask (cv::Mat) from PointCloud to extract object from scene observed with kinect

Question

I'm trying to implement some object recognition using OpenCV. So far I have only used recorded data without any clutter in the scene, which let me focus on the recognition part. I have now realized that this usually doesn't work in a real scenario. I want my algorithm to work for a scene where I place a single object on the floor or on a table in front of the camera, and then create a mask (cv::Mat) that lets me blank out everything but the object on the table/floor.

So I have just started looking at the Point Cloud Library, but I'm not sure how to do this or what the best approach is. I thought of running a plane segmentation algorithm on the point cloud from a Kinect camera, then removing the plane so that only the cloud of the object is left. After that I plan to read the x and y coordinates of the corner (outermost) points to create a mask with OpenCV. Would this work? Are there better ideas, or a source/example of a working solution?

Thanks for helping!

... The problem was that after the segmentation the point cloud is no longer organized and I did not manage to read out the (expected) x, y coordinates of the points in the cloud by using:

    for (size_t i = 0; i < cloud->points.size(); ++i)
    {
        std::cout << "PointCloud x: " << cloud->points[i].x << std::endl;
        std::cout << "PointCloud y: " << cloud->points[i].y << std::endl;
        std::cout << "PointCloud z: " << cloud->points[i].z << std::endl;
    }

This gave me some float numbers which I am not sure how to interpret. I apologize for the stupid question, but what do these numbers mean? How can I get the int coordinates?

Thanks

What is the object recognition algorithm that you are using with OpenCV and are you wed to it? This will probably determine your segmentation approach. — D.J.Duff, Apr 28 '14 at 06:03
I extract SIFT features from a rotating object (from all sides) so that I can identify it from any view when I see it again. To avoid also storing keypoints from the background, I want to apply a mask with OpenCV. — user2746420, Apr 28 '14 at 19:11

D.J.Duff · Answer 1 · 2014-04-29T19:14:38.787

Assuming that your object recognition algorithm is fixed, the following is a typical approach to segmenting an object above a plane. In your case I would suggest maintaining a set of indices into the original cloud so that you can easily create the mask at the end; Point Cloud Library generally provides this facility. (Alternatively, you can forget about keeping track of indices and, once you have the final point cloud of your object, use nearest-neighbour searches to find the corresponding indices in the original cloud, which is less efficient.)

The steps:

1. Plane-finding to find and extract the plane:
   http://pointclouds.org/documentation/tutorials/planar_segmentation.php#planar-segmentation
   http://pointclouds.org/documentation/tutorials/extract_indices.php#extract-indices

2. Remove noise:
   http://pointclouds.org/documentation/tutorials/statistical_outlier.php#statistical-outlier-removal

3. Euclidean clustering to find the object in the scene after the plane has been removed:
   http://pointclouds.org/documentation/tutorials/cluster_extraction.php#cluster-extraction

After you have done this you will have the set of indices of points from the original cloud that are in the segmented object. Note that the original cloud will be organised, so it will be easy to get the column/row of each point. You can then use these to create a mask, and possibly apply some dilation and/or erosion to clean up noise.
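The clean-up step on the mask could look roughly like the following. This is a minimal, library-free sketch of a 3x3 erosion followed by a 3x3 dilation (a morphological "opening", which removes isolated speckles); the names `Mask`, `morph3x3` and `open3x3` are made up for illustration, and the mask is assumed to be a row-major byte buffer with 255 for object pixels and 0 for background.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: row-major binary mask, 255 = object, 0 = background.
using Mask = std::vector<std::uint8_t>;

// 3x3 minimum filter (erosion) or maximum filter (dilation).
// Neighbours that fall outside the image are simply ignored.
inline Mask morph3x3(const Mask& src, int width, int height, bool dilate) {
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);
    Mask dst(src.size());
    for (int r = 0; r < height; ++r) {
        for (int c = 0; c < width; ++c) {
            std::uint8_t v = dilate ? 0 : 255;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    const int rr = r + dr;
                    const int cc = c + dc;
                    if (rr < 0 || rr >= height || cc < 0 || cc >= width) continue;
                    const std::uint8_t s = src[static_cast<std::size_t>(rr) * width + cc];
                    v = dilate ? std::max(v, s) : std::min(v, s);
                }
            }
            dst[static_cast<std::size_t>(r) * width + c] = v;
        }
    }
    return dst;
}

// Opening = erosion then dilation: drops speckles smaller than the 3x3 kernel.
inline Mask open3x3(const Mask& mask, int width, int height) {
    return morph3x3(morph3x3(mask, width, height, false), width, height, true);
}
```

If the mask is already a cv::Mat, OpenCV's own cv::erode and cv::dilate (or cv::morphologyEx with cv::MORPH_OPEN) would be the more natural choice; the sketch only shows what those operations do to the mask.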

More information: http://www.pointclouds.org/assets/iros2011/segmentation.pdf

Note: if you do not have to use separate segmentation and recognition steps, there are many algorithms that do both at the same time; one of these is LINEMOD, which is available in both PCL and OpenCV.

Edit

You have written that you need help in finding the indices of the original organised point cloud.

I wrote that there are two ways to do this: get the output of each step as points and search the original cloud to find the indices, or deal only with indices at each step. A third option is to use those algorithms that can leave the cloud organised (I am not sure which ones offer this option). Let's deal with the second option.

If you look at the documentation of the classes used above (pcl::SACSegmentation, pcl::StatisticalOutlierRemoval and pcl::EuclideanClusterExtraction), you will see that there are facilities to set indices on the input. This is analogous to setting a mask in OpenCV. It is generally a call to a function .setIndices() that accompanies .setInputCloud(). Similarly, if you want the algorithm to output a set of indices rather than a new cloud, you can pass an indices object to be filled (this is either a std::vector<int> or a pcl::PointIndices, which is pretty much the same thing; check the struct definition).
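To illustrate the index-only idea without depending on PCL, here is a minimal sketch of one pipeline step that takes indices in and gives indices out. It is not PCL code: `Point3`, `Plane` and `removePlane` are hypothetical names, and the plane is assumed to be already known as coefficients (a, b, c, d) of ax + by + cz + d = 0. The cloud itself is never copied or reordered, so every surviving index still refers to the original organised cloud.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a point type and fitted plane coefficients.
struct Point3 { float x, y, z; };
struct Plane  { float a, b, c, d; };  // ax + by + cz + d = 0

// Returns the subset of `indices` whose points lie farther than `threshold`
// from the plane. Points with non-finite coordinates are dropped as well
// (organised Kinect clouds contain NaN entries where depth is missing).
inline std::vector<int> removePlane(const std::vector<Point3>& cloud,
                                    const std::vector<int>& indices,
                                    const Plane& plane,
                                    float threshold) {
    const float norm = std::sqrt(plane.a * plane.a + plane.b * plane.b +
                                 plane.c * plane.c);
    assert(norm > 0.0f);
    std::vector<int> kept;
    for (int idx : indices) {
        if (idx < 0 || static_cast<std::size_t>(idx) >= cloud.size()) continue;
        const Point3& p = cloud[static_cast<std::size_t>(idx)];
        if (!std::isfinite(p.x) || !std::isfinite(p.y) || !std::isfinite(p.z)) continue;
        const float dist =
            std::fabs(plane.a * p.x + plane.b * p.y + plane.c * p.z + plane.d) / norm;
        if (dist > threshold) kept.push_back(idx);  // index into the ORIGINAL cloud
    }
    return kept;
}
```

Chaining further steps in the same style (each one consuming and producing a list of indices) is what keeps the link back to image rows and columns intact.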

So if you keep using the original point cloud and pass indices through the pipeline, you get a set of indices at the end. To convert these indices to a mask, you can calculate the x and y values from each index (as far as I recall, the organised point cloud is stored row-major, so integer division of the index by the width gives the y, and the index modulo the width gives the x).

http://pointclouds.org/documentation/tutorials/basic_structures.php#basic-structures
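The index-to-pixel conversion described above could be sketched as follows. This is a minimal, library-free illustration that assumes the cloud is organised and stored row-major; `Pixel`, `indexToPixel` and `maskFromIndices` are hypothetical names, and the mask is a plain byte buffer rather than a cv::Mat.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: position of a point in the organised cloud,
// which is also its pixel position in the image.
struct Pixel {
    int row;  // y
    int col;  // x
};

// Convert a linear index into an organised, row-major cloud to (row, col).
inline Pixel indexToPixel(int index, int width) {
    assert(width > 0 && index >= 0);
    return Pixel{index / width, index % width};
}

// Build a row-major 8-bit mask: 255 where an index is present, 0 elsewhere.
inline std::vector<std::uint8_t> maskFromIndices(const std::vector<int>& indices,
                                                 int width, int height) {
    assert(width > 0 && height > 0);
    std::vector<std::uint8_t> mask(static_cast<std::size_t>(width) * height, 0);
    for (int idx : indices) {
        if (idx < 0 || idx >= width * height) continue;  // ignore invalid indices
        const Pixel p = indexToPixel(idx, width);
        mask[static_cast<std::size_t>(p.row) * width + p.col] = 255;
    }
    return mask;
}
```

For a 640x480 cloud, for example, index 645 maps to row 1, column 5. If OpenCV is available, the resulting buffer can be wrapped with the cv::Mat constructor that takes rows, columns, a type (CV_8UC1 here) and a data pointer; note that such a Mat does not own the memory, so the vector has to outlive it.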

Thanks for the step-by-step description. I tried to do as you proposed but encountered some problems. I updated my question... — user2746420, Apr 28 '14 at 17:58
