I want to merge point clouds from several depth cameras. To achieve this, I am trying to convert the point cloud from each camera's frame to a single world frame, defined by an ArUco marker (similar to a QR code).
Right now, I am just starting with a single camera.
Let's say I have
K = ...  # camera intrinsic matrix (3x3)
aruco_world_coords = ...  # world coordinates (meters) of the marker corners (4x3), e.g.
                          # [[0,    0,    0],
                          #  [0.1,  0,    0],
                          #  [0.1, -0.1,  0],
                          #  [0,   -0.1,  0]]
rgb_image = ...  # camera's color image (height x width x 3)
cam_pc = ...  # point cloud in the camera's frame (n x 3)
I currently do the following:
import cv2 as cv
import numpy as np
from cv2 import aruco
#######################################################################
##### Detect Aruco code and extract rvec and tvec ######
#######################################################################
gray = cv.cvtColor(rgb_image, cv.COLOR_BGR2GRAY)
aruco_dict = aruco.Dictionary_get(aruco.DICT_4X4_50)
params = aruco.DetectorParameters_create()
corners, _, _ = aruco.detectMarkers(gray, aruco_dict, parameters=params)
_, rvec, tvec = cv.solvePnP(aruco_world_coords,
                            corners[0],
                            K, None, None, None,
                            False, cv.SOLVEPNP_ITERATIVE)
rmatrix, _ = cv.Rodrigues(rvec)
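# rmatrix and tvec describe the world->camera transform: p_cam = rmatrix @ p_world + tvec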
#######################################################################
##### Calculate new point cloud ######
#######################################################################
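# Invert that pose to go camera->world: p_world = R^T @ (p_cam - t)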
new_pc = np.matmul(np.linalg.inv(rmatrix), cam_pc.T-tvec).T
#######################################################################
##### Same math in a different way to verify result #####
#######################################################################
'''
t = -rmatrix.T.dot(tvec)
rmatrix_ = rmatrix.T
new_pc = cam_pc.dot(rmatrix_.T) + t.squeeze()
'''
#######################################################################
##### Same math, in yet another way, to be super sure it's right #####
#######################################################################
'''
m = np.hstack([rmatrix, tvec])
m = np.vstack([m, np.array([[0, 0, 0, 1]])])
m = np.linalg.inv(m)
cam_pc_homog = np.concatenate([cam_pc, np.ones((cam_pc.shape[0], 1))], axis=1)
new_pc = np.matmul(m, cam_pc_homog.T).T
new_pc /= new_pc[:,3].reshape(-1,1)
new_pc = new_pc[:,:3]
'''
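#######################################################################
##### One more cheap check: the forward pose should round-trip #####
#######################################################################
'''
back = (rmatrix @ new_pc.T + tvec).T  # forward pose: p_cam = R @ p_world + t
assert np.allclose(back, cam_pc, atol=1e-5)  # should recover the original cloud
'''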
I then display the new point cloud in a scene with axes drawn at the origin (0, 0, 0).
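The display step is roughly the following (sketched with Open3D here; the specific viewer shouldn't matter):

import open3d as o3d

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(new_pc.astype(np.float64))
# coordinate frame at the world origin, i.e. the marker's first corner
axes = o3d.geometry.TriangleMesh.create_coordinate_frame(size=0.1)
o3d.visualization.draw_geometries([pcd, axes])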
I now expect that when I point the camera at the ArUco marker, the points should shift, and indeed they do.
However, I also expect the point cloud to remain stationary as I rotate the camera, since the points have been transformed into the world frame and the world does not move relative to itself. This does not happen: the points continue to move (albeit in a different way) as I move the camera around.
I believe the rotation and translation are correct: when I use them to draw axes onto the marker in rgb_image, they land in exactly the right place.
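That sanity check is roughly the following (aruco.drawAxis is the old-API name; newer OpenCV calls it cv.drawFrameAxes):

axis_img = rgb_image.copy()
# 5 cm axes at the marker origin; None = no distortion, matching the solvePnP call
aruco.drawAxis(axis_img, K, None, rvec, tvec, 0.05)
cv.imshow('pose check', axis_img)
cv.waitKey(0)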
Any help would be greatly appreciated!