
I am trying to estimate the 3D position of a world point from two frames. The frames are captured with the same camera from different positions. The problem is that the estimate is wrong.

I have

Camera intrinsic parameters:
K = [4708.29296875, 0, 1218.51806640625;
 0, 4708.8935546875, 1050.080322265625;
 0, 0, 1]

Translation and Rotation data:
Frame X-Coord    Y-Coord    Z-Coord(altitude)  Pitch          Roll            Yaw
  1   353141.23  482097.85  38.678           0.042652439    1.172694124     16.72142499
  2   353141.82  482099.69  38.684           0.097542931    1.143224387     16.79931141

Note: The GPS positions (X, Y, Z coordinates) are Cartesian, in metre units, in the British National Grid system.

To get the rotation matrix I used https://stackoverflow.com/a/56666686/16432598, which is based on http://www.tobias-weis.de/triangulate-3d-points-from-3d-imagepoints-from-a-moving-camera/. Using the above data I calculate the extrinsic parameters and the projection matrices as follows.
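Roughly, this is how I build the rotation matrix from the pitch/roll/yaw columns (a sketch in NumPy; `euler_to_R` is my own helper name, and I am assuming degrees and a Z-Y-X, i.e. yaw·pitch·roll, composition — the convention of my GPS/IMU may well differ, which I suspect is part of the problem):

```python
import numpy as np

def euler_to_R(pitch_deg, roll_deg, yaw_deg):
    """Rotation matrix from Euler angles.
    Assumption: angles are in degrees, composed as Rz(yaw) @ Ry(pitch) @ Rx(roll).
    Other sources (and IMUs) use different orders/signs."""
    p, r, y = np.radians([pitch_deg, roll_deg, yaw_deg])
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(r), -np.sin(r)],
                   [0, np.sin(r),  np.cos(r)]])
    Ry = np.array([[ np.cos(p), 0, np.sin(p)],
                   [0, 1, 0],
                   [-np.sin(p), 0, np.cos(p)]])
    Rz = np.array([[np.cos(y), -np.sin(y), 0],
                   [np.sin(y),  np.cos(y), 0],
                   [0, 0, 1]])
    return Rz @ Ry @ Rx

# Frame 1 angles from the table above
R = euler_to_R(0.042652439, 1.172694124, 16.72142499)
```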

Rt0 = [-0.5284449976982357,  0.308213375891041,  -0.7910438668806931, 353141.21875;
       -0.8478960766271159,  -0.2384055118949635, 0.4735346398506075, 482097.84375;
       -0.04263950806535898, 0.9209600028339713,  0.3873171123665929, 38.67800140380859]

Rt1 = [-0.4590975294881605, 0.3270290779984009, -0.8260032933114635, 353141.8125;
       -0.8830316937622665, -0.2699087096524321, 0.3839326975722462, 482099.6875;
       -0.097388326965866,  0.905649640091175,   0.4126914624432091, 38.68399810791016]

P = K * Rt;

P1 = [-2540.030877954028,   2573.365272473235,  -3252.513377560185, 1662739447.059914;
      -4037.427278644764,   -155.5442017945203, 2636.538291686695,  2270188044.171295;
      -0.04263950806535898, 0.9209600028339713, 0.3873171123665929, 38.67800140380859]

P2 = [-2280.235105924588, 2643.299156802081, -3386.193495224041, 1662742249.915956;
      -4260.36781710715, -319.9665173096691, 2241.257388910372,  2270196732.490808;
      -0.097388326965866, 0.905649640091175, 0.4126914624432091, 38.68399810791016]

cv::triangulatePoints(P1, P2, p2d_1, p2d_2, out); // out: 4xN homogeneous points

Now I pick the same point in both images for triangulation: p2d_1 = (205, 806) and p2d_2 = (116, 813). For the 3D position of this particular point I expect something like [353143.7, 482130.3, 40.80], whereas I calculate [549845.5109014747, -417294.6070425579, -201805.410744677].
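For reference, the linear triangulation I understand `triangulatePoints` to perform can be sketched in NumPy (`triangulate` is my own helper, not an OpenCV function):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """DLT triangulation of one point from two views: solve A @ X_h = 0,
    where each image observation contributes two rows to A."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    # The solution is the right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize
```

On a synthetic camera pair this recovers the true point; note that with translation entries on the order of 10^5 to 10^9, as in my P1/P2 above, the linear system is badly conditioned, which may by itself corrupt the result.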

I know that my intrinsic parameters and GPS data are very accurate.

Can anybody tell me what is missing or what I am doing wrong here?

Thanks

morodozan
  • Welcome to Stackoverflow! I am not 100% sure but this could be a unit issue. I am assuming that your camera matrix consists of pixel values while your 3d points are in meters. Thus, you are multiplying different units. You could transform both values e.g. to mm. For the camera you would have to know the pixel size in mm. – Grillteller Jul 13 '21 at 12:24
  • Thanks very much! What I have done is remove the rotation completely, as if there were none. Then I started to see values closer to the ground truth (e.g. `[658137.39, 482097.85, 60.69]` where the ground truth is `[658145.33, 482135.75, 40.42]`). Using mm for the camera matrix as you suggested improved the result significantly. Now I calculate `[658142.71, 482101.83, 38.74]`. I guess my next step is to include the rotations in the calculation and convert the Euler angles to a rotation matrix properly [euler2Rotation](https://github.com/kam3k/euler). Any suggestions for the rotation part are appreciated. – morodozan Jul 14 '21 at 00:56

0 Answers