Simply put, I am trying to estimate camera poses from pictures of a table with an ArUco marker in the middle of it, using the OpenCV library. 20 pictures are taken at 360/20 = 18 degree increments. As far as I understand, `estimatePoseSingleMarkers` gives me the pose of the marker relative to the camera. Therefore I invert the pose of the ArUco marker:

R_cam = R_marker^T, tvec_cam = -R_marker^T * tvec_marker

where ^T signifies the transpose. However, when I compare the estimated poses to the true poses (taken directly from the camera parameters in Blender), the estimated cameras appear to be positioned further apart and also sit lower in the z-direction relative to the marker. Attached is a plot showing this: the red points are the true poses and the green points are the estimated ones. The square at the bottom is simply the corners of the marker. What may be the cause of this? Perhaps a loss of scale information?
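The inversion described above can be sketched in NumPy. This is a minimal illustration with made-up values; in practice `R_marker` would come from `cv2.Rodrigues(rvec)` applied to the `rvec` returned by `estimatePoseSingleMarkers`:

```python
import numpy as np

# Hypothetical marker pose in the camera frame (illustrative values;
# normally R_marker = cv2.Rodrigues(rvec)[0] and tvec_marker = tvec.ravel()).
R_marker = np.array([[0.0, -1.0, 0.0],
                     [1.0,  0.0, 0.0],
                     [0.0,  0.0, 1.0]])
tvec_marker = np.array([0.1, -0.2, 1.5])

# Invert the rigid transform: the camera's pose expressed in the marker frame.
R_cam = R_marker.T
tvec_cam = -R_marker.T @ tvec_marker

# Sanity checks: a rotation's transpose is its inverse, and mapping the
# marker's translation back through the inverse lands at the origin.
assert np.allclose(R_cam @ R_marker, np.eye(3))
assert np.allclose(R_cam @ tvec_marker + tvec_cam, 0.0)
```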

-
I would **highly recommend** labeling your 4x4 pose matrices with **both** bases, output base first, then input base, like `T_cam_marker` (pose from aruco call), which turns marker-local coordinates into camera-local coordinates (or expresses the marker's pose in camera space). -- then, you can talk of *inversion* of these 4x4 matrices, and you don't have to write such ugly expressions, and you don't have to throw around `rvec` and `tvec` separately. that practice makes the math just awful. – Christoph Rackwitz Feb 16 '22 at 11:44
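The 4x4 labeling convention suggested in this comment can be sketched as follows. The names and values are illustrative, assuming `R` and `t` come from the ArUco pose estimate:

```python
import numpy as np

# T_cam_marker maps marker-local coordinates into camera-local coordinates
# (equivalently, it is the marker's pose expressed in camera space).
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([0.1, -0.2, 1.5])

T_cam_marker = np.eye(4)
T_cam_marker[:3, :3] = R
T_cam_marker[:3, 3] = t

# The camera's pose in the marker frame is just the matrix inverse;
# no separate handling of rvec and tvec is needed.
T_marker_cam = np.linalg.inv(T_cam_marker)

# A point at the marker origin lands at t in camera coordinates.
corner_cam = T_cam_marker @ np.array([0.0, 0.0, 0.0, 1.0])
assert np.allclose(corner_cam[:3], t)
assert np.allclose(T_marker_cam @ T_cam_marker, np.eye(4))
```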
-
as for your plot... scale the axes equally! that's not square. also, I think you're potentially mixing up **handedness** and what axis goes where. OpenCV and ArUco are right-handed, with X,Y being in plane (marker/screen) – Christoph Rackwitz Feb 16 '22 at 11:46
-
Thanks for your suggestions Christoph, I'll bear these tips in mind. – AlfredH Feb 18 '22 at 15:51
1 Answer
0
So the solution was pretty simple: it was a fundamental flaw in the estimation of the intrinsics matrix. The images rendered from Blender did not come from a perfect pinhole camera; instead, the camera I had used had different fx and fy (focal lengths in x and y). Simply setting these to be equal led to near-perfect estimations.
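A minimal sketch of the fix, assuming a square-pixel pinhole camera with hypothetical Blender-style parameters (50 mm focal length, 36 mm sensor width, 1920x1080 render), so that fx == fy:

```python
import numpy as np

# Illustrative values, not from the original question.
focal_mm, sensor_mm = 50.0, 36.0
width, height = 1920, 1080

# Focal length in pixels; with square pixels fx and fy must be equal.
fx = focal_mm / sensor_mm * width
fy = fx
cx, cy = width / 2.0, height / 2.0

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])
assert K[0, 0] == K[1, 1]  # the condition that fixed the pose estimates
```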
