I have an RGB image (let's call it test.png) and a corresponding 3D point cloud (extracted using a stereo camera). Now I want to use the depth information to train my neural network.
The format of the 3D point cloud is:
.PCD v.7 - Point Cloud Data file format
FIELDS x y z rgb index
SIZE 4 4 4 4 4
TYPE F F F F U
COUNT 1 1 1 1 1
WIDTH 253674
HEIGHT 1
VIEWPOINT 0 0 0 1 0 0 0
POINTS 253674
DATA ascii
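(For reference, since DATA is ascii, every line after the header is one point, so the file can be read with plain numpy. A minimal sketch, assuming the file is named test.pcd; the header shown above has 10 lines, so adjust skiprows if your file also contains a VERSION line:)

```python
import numpy as np

# Each data line is "x y z rgb index". Reading every column as float is
# safe here: the largest index (640*480 - 1) is exactly representable.
points = np.loadtxt("test.pcd", skiprows=10)

xyz   = points[:, 0:3]            # x, y, z in "global space"
index = points[:, 4].astype(int)  # flattened pixel index
```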
How can I extract the depth information from the point cloud and, instead of using just the RGB image, add one more channel for depth and use the resulting RGBD image to train my network?
For example, the point cloud entries (FIELDS) for two pixels are:
1924.064 -647.111 -119.4176 0 25547
1924.412 -649.678 -119.7147 0 25548
According to the description (from the Cornell grasp dataset), each line is the point in space that intersects that pixel (from test.png), with x, y, and z coordinates relative to the base of the robot that was taking the images (so for our purposes we call this "global space").
You can tell which pixel each line refers to by the final column in each line (labelled "index").
That number is an encoding of the row and column number of the pixel. In all of our images,
there are 640 columns and 480 rows. Use the following formulas to map an index to a row, col pair.
Note that index = 0 maps to row 1, col 1.
row = floor(index / 640) + 1
col = (index MOD 640) + 1
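Putting this together, here is a hedged sketch of building the depth channel and stacking it onto test.png. Two assumptions not in the original: OpenCV (cv2) is used for image I/O (any image library works), and the Euclidean distance from the global-space origin is taken as the "depth" value; substitute plain z if that is what your network expects.

```python
import numpy as np
import cv2

ROWS, COLS = 480, 640

# Parse the ASCII .pcd as above.
points = np.loadtxt("test.pcd", skiprows=10)
xyz   = points[:, 0:3]
index = points[:, 4].astype(int)

# The formulas above are 1-based; with 0-based array indices they become:
#   row = index // 640,  col = index % 640
rows = index // COLS
cols = index % COLS

# Depth channel: distance of each point from the global-space origin
# (an assumption; use xyz[:, 2] for raw z instead). Pixels with no
# corresponding point stay 0 -- you may want to interpolate them later.
depth = np.zeros((ROWS, COLS), dtype=np.float32)
depth[rows, cols] = np.linalg.norm(xyz, axis=1)

# Stack depth as a 4th channel onto the image.
rgb  = cv2.imread("test.png")                      # note: OpenCV loads BGR
rgbd = np.dstack([rgb.astype(np.float32), depth])  # shape (480, 640, 4)
```

You would then normalize the depth channel the same way you normalize the RGB channels before feeding rgbd to the network.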