I'm working on a program that uses HoloLens 2 Research Mode in Unity. The HoloLens provides a depth image: for every pixel, the distance from the depth sensor to the object in front of it.

For every pixel, I project the pixel onto the image plane, then back-project it according to the distance captured by the depth sensor, which gives me the xyz in the depth sensor's coordinate frame. This point then needs to be transformed into the global coordinate system. To do so, I get the camera pose from Unity via cam_pose = Camera.main.transform (converted to a 4x4 matrix), and on the other hand I have the saved depth sensor extrinsic matrix. From these two matrices I build depth_to_world = cam_pose @ inv(extrinsic), and for every xyz from the depth frame I compute global_xyz = depth_to_world @ xyz to get the point in the real world.

The problem is that the returned point is off by 10-15 cm. What am I doing wrong? (The code is in Python.)
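One check I can describe: the composed transform should at least place the depth sensor's own origin at the headset's physical sensor position in world space. A minimal sketch of that check, using a hypothetical helper and assuming cam_pose and Constants.camera_extrinsic are plain 4x4 numpy arrays as in the code below:

import numpy as np

def sensor_origin_in_world(cam_pose, camera_extrinsic):
    """Hypothetical helper: world position of the depth sensor's origin
    under the composed transform (equivalently, the translation column
    of depth_to_world)."""
    depth_to_world = cam_pose @ np.linalg.inv(camera_extrinsic)
    origin = np.array([0.0, 0.0, 0.0, 1.0])  # sensor origin, homogeneous
    return (depth_to_world @ origin)[:3]

If that point is already off by roughly the same 10-15 cm, the error sits in the pose/extrinsic composition rather than in the per-pixel back-projection. Here is the per-pixel code: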
import numpy as np

x = self.us[Depth_i, Depth_j]  # normalized image-plane x for this pixel
y = self.vs[Depth_i, Depth_j]  # normalized image-plane y for this pixel
D = distance_img[Depth_i, Depth_j]  # radial distance along the ray, in millimeters
# Convert mm -> m and radial distance -> depth along the ray direction
# (x, y, 1), whose length is sqrt(x^2 + y^2 + 1).
distance = float(D) / 1000.0 / np.sqrt(x * x + y * y + 1)
# Compose: depth frame -> camera frame (inverse extrinsic) -> world frame.
depth_to_world = cam_pose @ np.linalg.inv(Constants.camera_extrinsic)
X = np.array([x * distance, y * distance, distance, 1.0]).reshape(4, 1)  # homogeneous point in the depth frame
point = (depth_to_world @ X)[0:3, 0]  # world-space xyz
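For reference, the same back-projection can be applied to the whole frame at once. A minimal vectorized sketch, using a hypothetical helper and assuming self.us, self.vs, and distance_img are H x W arrays as above:

import numpy as np

def unproject_depth_frame(us, vs, distance_img, depth_to_world):
    """Hypothetical helper: back-project every pixel of a depth frame.
    us, vs: H x W normalized image-plane coordinates.
    distance_img: H x W radial distances in millimeters.
    depth_to_world: 4 x 4 depth-sensor-to-world matrix.
    Returns an H x W x 3 array of world-space points in meters."""
    # mm -> m, radial distance -> depth along each pixel's ray (x, y, 1).
    z = (distance_img.astype(np.float64) / 1000.0) / np.sqrt(us * us + vs * vs + 1.0)
    # Homogeneous points in the depth frame, shape (4, H*W).
    pts = np.stack([us * z, vs * z, z, np.ones_like(z)], axis=0).reshape(4, -1)
    world = depth_to_world @ pts  # one multiply transforms all pixels
    h, w = distance_img.shape
    return world[:3].T.reshape(h, w, 3)

This is numerically identical to the per-pixel loop, so it won't change the 10-15 cm offset; it just makes it faster to visualize a full frame while debugging.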