I want to merge point clouds from several depth cameras. To achieve this, I am trying to convert the point cloud from each camera's frame to a single world frame, defined by an ArUco marker (similar to a QR code).
Right now, I am just starting with a single camera.
Let's say I have
K = ...  # camera intrinsic matrix (3x3)
aruco_world_coords = ...  # world coordinates (meters) of the marker corners (4x3), e.g.
                          # [[0,    0,    0],
                          #  [0.1,  0,    0],
                          #  [0.1, -0.1,  0],
                          #  [0,   -0.1,  0]]
rgb_image = ...  # camera's color image (height x width x 3)
cam_pc = ...  # point cloud in the camera's frame (n x 3)
I currently do the following:
import cv2 as cv
import numpy as np
from cv2 import aruco
#######################################################################
##### Detect Aruco code and extract rvec and tvec ######
#######################################################################
gray = cv.cvtColor(rgb_image, cv.COLOR_BGR2GRAY)
aruco_dict = aruco.Dictionary_get(aruco.DICT_4X4_50)
params = aruco.DetectorParameters_create()
corners, _, _ = aruco.detectMarkers(gray, aruco_dict, parameters=params)
_, rvec, tvec = cv.solvePnP(aruco_world_coords,
                            corners[0],
                            K, None, None, None,
                            False, cv.SOLVEPNP_ITERATIVE)
rmatrix, _ = cv.Rodrigues(rvec)
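# rmatrix and tvec describe the world->camera transform: p_cam = rmatrix @ p_world + tvec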
#######################################################################
##### Calculate new point cloud ######
#######################################################################
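# Invert that pose to go camera->world: p_world = R^T @ (p_cam - t)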
new_pc = np.matmul(np.linalg.inv(rmatrix), cam_pc.T-tvec).T
#######################################################################
##### Same math in a different way to verify result #####
#######################################################################
'''
t = -rmatrix.T.dot(tvec)
rmatrix_ = rmatrix.T
new_pc = cam_pc.dot(rmatrix_.T) + t.squeeze()
'''
#######################################################################
##### Same math, in yet another way, to be super sure it's right #####
#######################################################################
'''
m = np.hstack([rmatrix, tvec])
m = np.vstack([m, np.array([[0, 0, 0, 1]])])
m = np.linalg.inv(m)
cam_pc_homog = np.concatenate([cam_pc, np.ones((cam_pc.shape[0], 1))], axis=1)
new_pc = np.matmul(m, cam_pc_homog.T).T
new_pc /= new_pc[:,3].reshape(-1,1)
new_pc = new_pc[:,:3]
'''
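#######################################################################
##### One more cheap check: the forward pose should round-trip #####
#######################################################################
'''
back = (rmatrix @ new_pc.T + tvec).T  # forward pose: p_cam = R @ p_world + t
assert np.allclose(back, cam_pc, atol=1e-5)  # should recover the original cloud
'''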
I then display the new point cloud in a scene with axes drawn at the origin (0, 0, 0).
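The display step is roughly the following (sketched with Open3D here; the specific viewer shouldn't matter):

import open3d as o3d

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(new_pc.astype(np.float64))
# coordinate frame at the world origin, i.e. the marker's first corner
axes = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.1)
o3d.visualization.draw_geometries([pcd, axes])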
I now expect that when I point the camera at the ArUco marker, the points should shift, and indeed they do.
However, I also expect the point cloud to remain stationary as I rotate the camera, since the points have been transformed into the world frame and the world does not move relative to itself. This does not happen: the points continue to move (albeit in a different way) as I move the camera around.
I believe the rotation and translation are correct: when I use them to draw axes onto the marker in rgb_image, they land in exactly the right place.
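That sanity check is roughly the following (aruco.drawAxis is the old-API name; newer OpenCV calls it cv.drawFrameAxes):

axis_img = rgb_image.copy()
# 5 cm axes at the marker origin; None = no distortion, matching the solvePnP call
aruco.drawAxis(axis_img, K, None, rvec, tvec, 0.05)
cv.imshow('pose check', axis_img)
cv.waitKey(0)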
Any help would be greatly appreciated!