
I am trying to generate a point cloud using images captured by Kinect with Python and libfreenect, but I couldn't align the depth data to RGB data taken by Kinect.

I applied Nicolas Burrus's equations, but the two images come out even further misaligned. Is there something wrong with my code?

import numpy as np

cx_d = 3.3930780975300314e+02
cy_d = 2.4273913761751615e+02
fx_d = 5.9421434211923247e+02
fy_d = 5.9104053696870778e+02
fx_rgb = 5.2921508098293293e+02
fy_rgb = 5.2556393630057437e+02
cx_rgb = 3.2894272028759258e+02
cy_rgb = 2.6748068171871557e+02
RR = np.array([
    [0.999985794494467, -0.003429138557773, 0.00408066391266],
    [0.003420377768765, 0.999991835033557, 0.002151948451469],
    [-0.004088009930192, -0.002137960469802, 0.999989358593300],
])
TT = np.array([1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02])

# uu, vv are pixel indices in the depth image
def depth_to_xyz_and_rgb(uu, vv):
    # get z value in meters
    pcz = depthLookUp[depths[vv, uu]]

    # compute x, y values in meters
    pcx = (uu - cx_d) * pcz / fx_d
    pcy = (vv - cy_d) * pcz / fy_d

    # apply extrinsic calibration
    P3D = np.array([pcx, pcy, pcz])
    P3Dp = np.dot(RR, P3D) - TT

    # rgb indices that P3D should match
    uup = P3Dp[0] * fx_rgb / P3Dp[2] + cx_rgb
    vvp = P3Dp[1] * fy_rgb / P3Dp[2] + cy_rgb

    # return a point in the point cloud and its corresponding color indices
    return P3D, uup, vvp

Is there anything I did wrong? Any help is appreciated.

Max One

1 Answer


First, check your calibration numbers. Your rotation matrix is approximately the identity and, assuming your calibration frame is metric, your translation vector says that the second camera is 2 centimeters to the side and one centimeter displaced in depth. Does that approximately match your setup? If not, you may be working with the wrong scaling (likely using a wrong number for the characteristic size of your calibration target - a checkerboard?).
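You can do that sanity check numerically with your own numbers. A minimal sketch (using the `RR` and `TT` values from your question): the rotation should be nearly the identity, and the translation norm should roughly match the physical baseline between the two Kinect cameras:

```python
import numpy as np

# Extrinsics as posted in the question
RR = np.array([
    [0.999985794494467, -0.003429138557773, 0.00408066391266],
    [0.003420377768765, 0.999991835033557, 0.002151948451469],
    [-0.004088009930192, -0.002137960469802, 0.999989358593300],
])
TT = np.array([1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02])

# How far is the rotation from the identity? Off-diagonal terms should be tiny.
print(np.abs(RR - np.eye(3)).max())  # about 0.004 -- cameras nearly parallel

# Baseline in meters, assuming a metric calibration frame
print(np.linalg.norm(TT))  # about 0.023 m, i.e. ~2 cm
```

If that ~2 cm does not match the real baseline of your sensor, the calibration scale is off.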

Your code looks correct otherwise: you are back-projecting a pixel of the depth camera at a known depth, then projecting it into the second camera to get the corresponding RGB value.

One thing I would check is whether you're using your coordinate transform in the right direction. IIRC, OpenCV produces it as [R | t], but you are using it as [R | -t], which looks suspicious. Perhaps you meant to use its inverse, which would be [R' | -R'*t], where I use the apostrophe to mean transposition.
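A sketch of the direct [R | t] form, assuming your extrinsics map depth-camera coordinates into RGB-camera coordinates as P_rgb = R*P + t. I pass the metric depth `pcz` in directly, since your `depths`/`depthLookUp` structures aren't shown:

```python
import numpy as np

# Intrinsics and extrinsics from the question
fx_d, fy_d = 5.9421434211923247e+02, 5.9104053696870778e+02
cx_d, cy_d = 3.3930780975300314e+02, 2.4273913761751615e+02
fx_rgb, fy_rgb = 5.2921508098293293e+02, 5.2556393630057437e+02
cx_rgb, cy_rgb = 3.2894272028759258e+02, 2.6748068171871557e+02
RR = np.array([
    [0.999985794494467, -0.003429138557773, 0.00408066391266],
    [0.003420377768765, 0.999991835033557, 0.002151948451469],
    [-0.004088009930192, -0.002137960469802, 0.999989358593300],
])
TT = np.array([1.9985242312092553e-02, -7.4423738761617583e-04, -1.0916736334336222e-02])

def depth_to_xyz_and_rgb(uu, vv, pcz):
    """Map depth pixel (uu, vv) at depth pcz (meters) to a 3D point
    and its pixel coordinates in the RGB image."""
    # back-project into the depth camera's 3D frame
    P3D = np.array([(uu - cx_d) * pcz / fx_d,
                    (vv - cy_d) * pcz / fy_d,
                    pcz])
    # OpenCV-style [R | t]: note + TT, not - TT as in the question
    P3Dp = RR @ P3D + TT
    # project into the RGB image
    uup = P3Dp[0] * fx_rgb / P3Dp[2] + cx_rgb
    vvp = P3Dp[1] * fy_rgb / P3Dp[2] + cy_rgb
    return P3D, uup, vvp
```

If your calibration was run the other way around (RGB to depth), use the inverse instead: `P3Dp = RR.T @ (P3D - TT)`. Trying both directions and seeing which one registers the images is a quick way to settle it.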

Francesco Callari