
I am currently studying OpenCV's aruco module, focusing in particular on the pose estimation of ArUco markers and AprilTags.

Looking into the subpixel accuracy, I have encountered a strange behaviour, demonstrated by the code below: if I provide a "perfect" calibration (e.g. cx/cy equal to the image center and zero distortion) and a "perfect" marker with known edge length, cv.detectMarkers only yields the correct value when the rotation is at 0, 90, 180 or 270 degrees. For all other orientations the subpixel routine yields an (almost) constant value, yet at a "shifted" level. It is clear that at exactly 0/90/180/270 degrees the corner pixels produce a sharp transition and can thus be detected with high precision. However, I struggle to see where the underestimated length in all other cases originates. Is this a bug, or does it result from some trigonometry? Look at the graphs generated by the script below: the error in the pose results from the error in the corner detection, so the detection accuracy depends on the orientation of the code.

I also checked ArUco markers and different subpixel refinement methods (see the sketch below). The "peaks" remain, although the angular behaviour in between changes.
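For completeness, switching the refinement method only requires changing one detector parameter; a minimal sketch using the constants exposed by cv.aruco in the API version used below:

import cv2 as cv

para = cv.aruco.DetectorParameters_create()
# pick one of the available corner refinement strategies:
para.cornerRefinementMethod = cv.aruco.CORNER_REFINE_SUBPIX     # cornerSubPix on the corners
# para.cornerRefinementMethod = cv.aruco.CORNER_REFINE_CONTOUR  # line fits to the contour
# para.cornerRefinementMethod = cv.aruco.CORNER_REFINE_APRILTAG # AprilTag-style refinement
# para.cornerRefinementMethod = cv.aruco.CORNER_REFINE_NONE     # raw polygon corners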

I am pretty sure that this is not due to the interpolation involved in rotating the marker, since I observe the same behaviour in real data as well (note, however, that the "height" of the peaks seems to depend on the interpolation method; you can test this by changing the flag in cv.warpAffine, e.g. to cv.INTER_LINEAR; a self-contained sketch follows).
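To see the interpolation effect in isolation, here is a minimal self-contained sketch (a plain black square stands in for the tag; the sizes and the angle are arbitrary assumptions):

import numpy as np
import cv2 as cv

img = np.full((200, 200), 255, np.uint8)
img[50:150, 50:150] = 0  # a plain black square as a stand-in for the tag
rot_mat = cv.getRotationMatrix2D((99.5, 99.5), 10, 1)
for flag in (cv.INTER_NEAREST, cv.INTER_LINEAR, cv.INTER_CUBIC, cv.INTER_LANCZOS4):
    warped = cv.warpAffine(img, rot_mat, (200, 200), flags=flag)
    # each flag smears/rings the black-white transition differently,
    # which is what the subpixel refinement then has to work with:
    print(flag, warped[100, 45:56])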

My questions would then be:

  1. Are the peaks due to a bug or is this expected behaviour?
  2. If the latter, could you help me understand why?
  3. Is there a way to eliminate this orientation dependency of the accuracy (other than increasing the system's resolution such, that no subpixeling is required)?

EDIT: Note that the AprilTag functions have been added to OpenCV only recently, so you will need to upgrade to the newest version, which is not yet available in some standard repositories. You can, e.g., get an up-to-date version from conda-forge. /EDIT
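A quick sanity check that your build is recent enough is to confirm that the AprilTag refinement constant is actually exposed (a minimal sketch):

import cv2 as cv

print(cv.__version__)
# the script below relies on the AprilTag corner refinement of recent 4.x builds:
assert hasattr(cv.aruco, "CORNER_REFINE_APRILTAG"), "aruco module too old"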

# -*- coding: utf-8 -*-

import numpy as np
import cv2 as cv
import pylab as plt

""" generate an "ideal" calibration with zero distortion and perfect alignment 
    of the main optical axis: """
cam_matrix = np.array([[1.0e4, 0.0, 1224.0],
                       [0.0, 1.0e4, 1024.0],
                       [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros((14, 1))  # zero distortion (full 14-coefficient model)

# define detection parameters
marker_length = 30.00 # some arbitrary value
marker_length_px = 700
marker_id = 3
dictionary = cv.aruco.getPredefinedDictionary(cv.aruco.DICT_APRILTAG_16H5)
para = cv.aruco.DetectorParameters_create()
para.cornerRefinementMethod = cv.aruco.CORNER_REFINE_APRILTAG
para.aprilTagDeglitch = 0           
para.aprilTagMinWhiteBlackDiff = 30
para.aprilTagMaxLineFitMse = 20
para.aprilTagCriticalRad = 0.1745329201221466 *6
para.aprilTagMinClusterPixels = 5  
para.maxErroneousBitsInBorderRate = 0.35
para.errorCorrectionRate = 1.0                    
para.minMarkerPerimeterRate = 0.05                  
para.maxMarkerPerimeterRate = 4                  
para.polygonalApproxAccuracyRate = 0.05
para.minCornerDistanceRate = 0.05

marker_length_list = []
tvec_z_list = []

# generate pictures: AprilTag ID 3, centered in the image, rotated in fixed angular steps (here 10 degrees)
degrees_list = np.linspace(0, 350, 36, dtype=int).tolist()
marker = cv.aruco.drawMarker(dictionary, marker_id, marker_length_px)
img = np.full((2048, 2448), 255, np.uint8)  # white canvas
img[674:1374, 874:1574] = marker            # 700 px marker centered in the image
cv.imshow("Original", img)
cv.waitKey(1)  # give the window a chance to render
cv.imwrite("original.png", img)
rows, cols = img.shape

for entry in degrees_list:
    # rotate original picture
    rot_mat = cv.getRotationMatrix2D(((cols-1)/2, (rows-1)/2), entry, 1)  # center is given in (x, y) order
    rot_img = cv.warpAffine(img, rot_mat, (cols, rows), flags=cv.INTER_CUBIC) # interpolation changes the "peak amplitude" e.g. try cv.INTER_LINEAR instead 
    # detect marker and get pose estimate
    corners, ids, rejected = cv.aruco.detectMarkers(rot_img,dictionary,parameters=para)
    my_index = ids.tolist().index([marker_id])
    fCorners = corners[my_index]
    fRvec,fTvec, _obj_points = cv.aruco.estimatePoseSingleMarkers(fCorners, marker_length, cam_matrix, dist_coeffs)
    # calculate the length of each edge from the refined (subpixel) corners
    c = fCorners[0]
    L1 = np.linalg.norm(c[0] - c[1])
    L2 = np.linalg.norm(c[0] - c[3])
    L3 = np.linalg.norm(c[2] - c[1])
    L4 = np.linalg.norm(c[2] - c[3])
    mean_length = (L1 + L2 + L3 + L4) / 4
    # append results
    marker_length_list.append(mean_length)
    tvec_z_list.append(fTvec[0][0][2])
        
plt.figure("TVEC Z")
plt.plot(degrees_list, tvec_z_list, "go--")
plt.xlabel("marker rotation angle (°)")
plt.ylabel("TVEC Z (units of length)")

plt.figure("Mean marker length (should be 700)")
plt.plot(degrees_list, marker_length_list, "bo--")
plt.xlabel("marker rotation angle (°)")
plt.ylabel("marker length (pixels)")

EDIT2: As suggested by Christoph Rackwitz, here is the output generated by the script as pictures:

[Figure: mean marker length vs. rotation angle; should ideally be a constant 700 px]

[Figure: marker distance (tvec Z) vs. rotation angle; should be invariant under rotation]

  • if you've got graphs you want people to look at, perhaps add pictures of them to your question. – Christoph Rackwitz Aug 02 '22 at 19:00
  • Take a closer look at the code: The graphs get generated by the snippet ;-) – chris Aug 03 '22 at 20:11
  • you probably already answered your question or no longer have the need. it's been 2.5 years since you asked. I understand if this question is no longer relevant. -- pictures get the point across easier and faster than text. you, asking a question, compete for attention against others. it would be in anyone's interest to make a question catchy, rather than a bear to read. 85 lines of dense code aren't enjoyable. nor should you assume that anyone would copy-paste your code and run it. – Christoph Rackwitz Aug 03 '22 at 22:01
  • Though I am no longer working on that project, I would still like to know, what exactly causes this "modulation". As the question has now more than 3k views, it might also be valuable for the rest of the community. I have added the graphs as requested. Any ideas about the origin of the features are welcome. – chris Aug 06 '22 at 10:59
  • I hate matplotlib plots for being so deceptive. I kept wondering why the first plot shows _negative_ lengths but it's *700 plus* something... -- you're getting a mere 2% of a pixel of difference (30 ppm in length, 30 ppm in Z). warpAffine uses fixed-point math, 5 fractional bits. also: cubic interpolation. also: corner refinement method (that hopefully works on the edges, not the corners)... and this here is all happening in a "linear colorspace" (the warpAffine). if you gave this real pictures, those are gamma-mapped, and webcams apply sharpening filters, so you'll get even wilder results – Christoph Rackwitz Aug 06 '22 at 11:09
  • I am also interested in this error. Thanks for posting the graphs! But in the first graph, the label for the vertical axis seems to be cut off. And the axis may be mislabeled in the example code; I suspect the label for that axis is probably meant to be "difference from 700". But if that's the case, it would probably be better to make the graph with the axis as "calculated marker length", and then show it with the range 699.99 - 700.01. Assuming I am reading the charts correctly. – user3685427 Aug 26 '22 at 15:44
  • I did some cleanup to the weird matplotlib formatting... I do hope readability has improved now. – chris Oct 08 '22 at 11:50
  • I do think what Christoph Rackwitz pointed out is the correct answer: when the pixel pattern perfectly aligns with the contours, there is no rounding involved, so this is just an effect of computational precision. @ChristophRackwitz: If you add this as an answer, I can mark it as accepted. Thank you for all the feedback. – chris Oct 08 '22 at 11:58

1 Answer


Answer by Christoph Rackwitz (see comments):

you're getting a mere 2% of a pixel of difference (30 ppm in length, 30 ppm in Z). warpAffine uses fixed-point math, 5 fractional bits. also: cubic interpolation. also: corner refinement method (that hopefully works on the edges, not the corners)... and this here is all happening in a "linear colorspace" (the warpAffine). if you gave this real pictures, those are gamma-mapped, and webcams apply sharpening filters, so you'll get even wilder results
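To make the magnitude concrete, here is a short back-of-the-envelope sketch using the numbers from the question (fx = 1e4 px, nominal edge length 700 px, the ~0.02 px edge-length error quoted above); the pinhole relation Z = fx * L_real / L_px is an assumption that holds for the fronto-parallel marker in the script:

# back-of-the-envelope check (numbers taken from the question and comments)
fx = 1.0e4        # focal length in pixels, from the synthetic calibration
edge_px = 700.0   # nominal marker edge length in the image
err_px = 0.02     # ~2% of a pixel of edge-length error, as observed
L_real = 30.0     # physical marker length used in the script

print(err_px / edge_px * 1e6)   # ~29 ppm relative length error

# pinhole model for a fronto-parallel marker: Z = fx * L_real / L_px,
# so an underestimated length gives a correspondingly overestimated Z:
Z_true = fx * L_real / edge_px
Z_meas = fx * L_real / (edge_px - err_px)
print((Z_meas - Z_true) / Z_true * 1e6)  # ~ +29 ppm in Z

# warpAffine itself quantizes coordinates to 5 fractional bits:
print(1 / 32)     # = 0.03125 px, the same order as the observed error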
