I'm using a NumPy implementation of camera calibration by direct linear transformation (DLT) in Python, and I'm trying to use it for 3D camera calibration. My problem is that the mean error of the DLT (the mean residual of the DLT transformation, in units of camera coordinates) is very high in my example: it is in the thousands of pixels, far higher than in the examples provided by the original author (see here).
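To make the error measure concrete, here is a minimal sketch of a DLT fit and its per-point residuals. It assumes a plain, unnormalized 11-parameter DLT solved by SVD; it is not the implementation I'm using, and `dlt_calibrate` is a hypothetical name chosen for illustration.

```python
import numpy as np

def dlt_calibrate(xyz, uv):
    """Fit a 3x4 projection matrix by plain (unnormalized) DLT.

    xyz: (n, 3) world points, uv: (n, 2) pixel points, n >= 6.
    Returns (P, residuals), where residuals[i] is the pixel distance
    between the measured and the reprojected image point i.
    """
    xyz = np.asarray(xyz, dtype=float)
    uv = np.asarray(uv, dtype=float)
    n = xyz.shape[0]
    if uv.shape[0] != n or n < 6:
        raise ValueError("need at least 6 matching 3D/2D points")

    # Each correspondence contributes two rows to the system A p = 0.
    A = np.zeros((2 * n, 12))
    for i, ((x, y, z), (u, v)) in enumerate(zip(xyz, uv)):
        A[2 * i] = [x, y, z, 1, 0, 0, 0, 0, -u * x, -u * y, -u * z, -u]
        A[2 * i + 1] = [0, 0, 0, 0, x, y, z, 1, -v * x, -v * y, -v * z, -v]

    # The right singular vector of the smallest singular value
    # is the least-squares solution of A p = 0 with |p| = 1.
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)

    # Reproject the 3D points and compare with the measured pixels.
    proj = np.hstack([xyz, np.ones((n, 1))]) @ P.T
    uv_hat = proj[:, :2] / proj[:, 2:3]
    residuals = np.linalg.norm(uv_hat - uv, axis=1)
    return P, residuals
```

With this sketch, the mean error would be `residuals.mean()`, and looking at the individual entries of `residuals` shows whether a few correspondences dominate it.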
These are the 3D points I use:
objpoints = [[86.438, -174.922, 51.316], [-27.519, -215.460, 39.154],
             [73.601, 107.800, 120.455], [87.602, 133.413, 34.023],
             [101.276, -55.204, 108.884], [88.509, -68.038, 116.634],
             [27.518, -215.460, 39.154], [-31.355, -207.334, 85.184],
             [87.601, -131.059, 33.881], [-60.234, -23.833, 148.269],
             [62.162, -23.042, 148.715]]
These are the matching 2D pixel coordinates I use:
imgpoints = [[576.0, 861.0], [660.0, 996.0], [253.0, 1383.0], [575.0, 1481.0],
             [276.0, 1217.0], [241.0, 1139.0], [665.0, 461.0], [231.0, 411.0],
             [660.0, 226.0], [141.0, 684.0], [111.0, 1123.0]]
I extracted these points manually: the 3D points from a point cloud model (.ply format) and the matching 2D points from the image, in pixels.
Something must be wrong with my coordinates at a very basic level, but I'm not sure what it is or how to find it.
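One thing I can think of is a basic sanity check on the raw input before any fitting. The sketch below is only my guess at what is worth checking, not part of the DLT code: it compares the point counts, looks for duplicated points, and measures how close the 3D points are to a single plane, since a 3D DLT needs non-coplanar control points. `check_points` and the dictionary keys are hypothetical names.

```python
import numpy as np

def check_points(xyz, uv):
    """Return simple diagnostics for a set of 3D/2D correspondences."""
    xyz = np.asarray(xyz, dtype=float)
    uv = np.asarray(uv, dtype=float)

    # Singular values of the centered 3D points describe their spread
    # along three orthogonal directions; a tiny smallest value means
    # the points lie (almost) in one plane.
    s = np.linalg.svd(xyz - xyz.mean(axis=0), compute_uv=False)
    ratio = float(s[-1] / s[0]) if s[0] > 0 else 0.0

    return {
        "counts_match": xyz.shape[0] == uv.shape[0],
        "duplicate_3d": bool(np.unique(xyz, axis=0).shape[0] < xyz.shape[0]),
        "duplicate_2d": bool(np.unique(uv, axis=0).shape[0] < uv.shape[0]),
        "planarity_ratio": ratio,
    }
```

A `planarity_ratio` close to zero would point to a degenerate point layout; this check cannot detect 3D and 2D points that are listed in a different order or measured in a different axis convention.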