
I am using this legacy code: http://fossies.org/dox/opencv-2.4.8/trifocal_8cpp_source.html to estimate 3D points from given corresponding 2D points in 3 different views. The problem I face is the same as the one stated here: http://opencv-users.1802565.n2.nabble.com/trifocal-tensor-icvComputeProjectMatrices6Points-icvComputeProjectMatricesNPoints-td2423108.html

I was able to compute the projection matrices successfully using icvComputeProjectMatrices6Points, with six sets of corresponding points from the 3 views. The results are shown below:

projMatr1 P1 = 
[-0.22742541, 0.054754492, 0.30500898, -0.60233182;
  -0.14346679, 0.034095913, 0.33134204, -0.59825808;
  -4.4949986e-05, 9.9166318e-06, 7.106331e-05, -0.00014547621]

projMatr2 P2 = 
[-0.17060626, -0.0076031247, 0.42357284, -0.7917347;
  -0.028817834, -0.0015948272, 0.2217239, -0.33850163;
  -3.3046148e-05, -1.3680664e-06, 0.0001002633, -0.00019192585]

projMatr3 P3 = 
[-0.033748217, 0.099119112, -0.4576003, 0.75215244;
  -0.001807699, 0.0035084449, -0.24180284, 0.39423448;
  -1.1765103e-05, 2.9554356e-05, -0.00013438619, 0.00025332544]

I then computed the 3D points using icvReconstructPointsFor3View. The six reconstructed points (in homogeneous coordinates, one per column) are:

4D points = 
[-0.4999997, -0.26867214, -1, 2.88633e-07, 1.7766099e-07, -1.1447386e-07;
  -0.49999994, -0.28693244, 3.2249036e-06, 1, 7.5971762e-08, 2.1956141e-07;
  -0.50000024, -0.72402155, 1.6873783e-07, -6.8603946e-08, -1, 5.8393886e-07;
  -0.50000012, -0.56681377, 1.202426e-07, -4.1603233e-08, -2.3659911e-07, 1]

The actual (ground-truth) 3D points are the following:

   - { ID:1,X:500.000000, Y:800.000000, Z:3000.000000}
   - { ID:2,X:500.000000, Y:800.000000, Z:4000.000000}
   - { ID:3,X:1500.000000, Y:800.000000, Z:4000.000000}
   - { ID:4,X:1500.000000, Y:800.000000, Z:3000.000000}
   - { ID:5,X:500.000000, Y:1800.000000, Z:3000.000000}
   - { ID:6,X:500.000000, Y:1800.000000, Z:4000.000000}

My question now is: how do I transform P1, P2 and P3 into a form that allows a meaningful triangulation? I need to compute the correct 3D points using the trifocal tensor.
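For reference, the triangulation step itself reduces to a small linear (DLT) solver. Below is a minimal numpy sketch of that idea (not the OpenCV internals); keep in mind that feeding it projective matrices like P1–P3 above still only yields points in a projective frame:

```python
import numpy as np

def triangulate_3view(Ps, xs):
    """Linear (DLT) triangulation from N >= 2 views.

    Ps: list of 3x4 camera matrices; xs: list of homogeneous
    image points (u, v, 1). Returns a homogeneous 4-vector,
    normalized so its last coordinate is 1.
    """
    rows = []
    for P, x in zip(Ps, xs):
        u, v = x[0] / x[2], x[1] / x[2]
        rows.append(u * P[2] - P[0])   # two independent equations per view
        rows.append(v * P[2] - P[1])
    A = np.asarray(rows)
    # the 3D point is the right null vector of A (smallest singular value)
    X = np.linalg.svd(A)[2][-1]
    return X / X[3]
```

With the projective P1, P2, P3 from icvComputeProjectMatrices6Points this reproduces distorted "4D points" like the ones above; only metric camera matrices make the output Euclidean.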

magarwal

1 Answer


The trifocal tensor won't help you here because, like the fundamental matrix, it only enables a projective reconstruction of the scene and camera poses. If X0_j and P0_i are the true 3D points and camera matrices, this means that the reconstructed points Xp_j = inv(H).X0_j and camera matrices Pp_i = P0_i.H are only defined up to a common invertible 4x4 matrix H, which is unknown.
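This ambiguity is easy to check numerically. The following illustrative numpy snippet (random data, for demonstration only) shows that any invertible H leaves every image projection unchanged, so no amount of image data can determine H:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((3, 4))    # a "true" camera matrix P0_i
X = rng.standard_normal((4, 6))    # six homogeneous 3D points X0_j
H = rng.standard_normal((4, 4))    # arbitrary 4x4 (invertible with probability 1)

Pp = P @ H                         # transformed camera  Pp_i = P0_i.H
Xp = np.linalg.inv(H) @ X          # transformed points  Xp_j = inv(H).X0_j

# the projections are identical, so the images cannot distinguish (P, X) from (Pp, Xp)
assert np.allclose(Pp @ Xp, P @ X)
```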

In order to obtain a metric reconstruction, you need to know the calibration matrices of your cameras. Whether you know these matrices (e.g. if you use virtual cameras for image rendering) or you estimated them using camera calibration (see OpenCV calibration tutorials), you can find a method to obtain a metric reconstruction in §7.4.5 of "Geometry, constraints and computation of the trifocal tensor", by C.Ressl (PDF).

Note that even when using this method, you cannot recover the absolute scale of the 3D reconstruction, unless you have some additional knowledge (such as the actual distance between two fixed 3D points).

Sketch of the algorithm:

Inputs: the three camera matrices P1, P2, P3 (projective world coordinates, with the coordinate system chosen so that P1=[I|0]), the associated calibration matrices K1, K2, K3, and one point correspondence x1, x2, x3 (in homogeneous pixel coordinates, i.e. 3x1 vectors of the form (u, v, 1)').

Outputs: the three camera matrices P1_E, P2_E, P3_E (metric reconstruction).

  1. Set P1_E=K1.[I|0]

  2. Compute the fundamental matrices F21, F31. Denoting P2=[A|a] and P3=[B|b], you have F21=[a]x.A and F31=[b]x.B (see table 9.1 in [HZ00]), where, for a 3x1 vector e, [e]x = [0,-e_3,e_2; e_3,0,-e_1; -e_2,e_1,0]

  3. Compute the essential matrices E21 = K2'.F21.K1 and E31 = K3'.F31.K1

  4. For i = 2,3, do the following

    i. Compute the SVD Ei1=U.S.V'. If det(U)<0 set U=-U. If det(V)<0 set V=-V.

    ii. Define W=[0,-1,0;1,0,0;0,0,1], Ri=U.W.V' and ti = third column of U

    iii. Define M=[Ri'.ti]x, X1=M.inv(K1).x1 and Xi=M.Ri'.inv(Ki).xi

    iv. If X1_3.Xi_3<0, set Ri=U.W'.V' and recompute M and X1

    v. If X1_3<0 set ti = -ti

    vi. Define Pi_E=Ki.[Ri|ti]

  5. Do the following to retrieve the correct scale for t3 (consistently with the normalization ||t2||=1):

    i. Define p2=R2'.inv(K2).x2 and p3=R3'.inv(K3).x3

    ii. Define M=[p2]x

    iii. Compute the scale s=(p3'.M.R2'.t2)/(p3'.M.R3'.t3) (ordinary matrix products, evaluated left to right; numerator and denominator are both scalars)

    iv. Set t3=t3*s

  6. End of the algorithm: the camera matrices P1_E, P2_E, P3_E are valid up to an isotropic scaling of the scene and a change of 3D coordinate system (hence it is a metric reconstruction).
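The steps above can be sketched in numpy as follows. This is an illustrative implementation run on synthetic data (the calibration matrix, rotations, translations and 3D point below are all made up for the demo), not production code; in step 4 it substitutes the standard four-solution cheirality test from [HZ00] (triangulate the correspondence and require positive depth in both cameras) for the sign tests of steps 4.iii–4.v, which selects the same pose for a valid correspondence:

```python
import numpy as np

def skew(e):
    """[e]x, so that skew(e) @ v == np.cross(e, v)."""
    return np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])

def triangulate(Pa, Pb, ua, ub):
    """Linear (DLT) triangulation of one point from two views."""
    rows = []
    for P, u in ((Pa, ua), (Pb, ub)):
        rows.append(u[0] / u[2] * P[2] - P[0])
        rows.append(u[1] / u[2] * P[2] - P[1])
    return np.linalg.svd(np.asarray(rows))[2][-1]   # right null vector

def pose_from_essential(E, K1, Ki, x1, xi):
    """Steps 4.i-4.vi: recover [Ri|ti] from Ei1 (cheirality via triangulation)."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0: U = -U
    if np.linalg.det(Vt) < 0: Vt = -Vt
    W = np.array([[0.0, -1, 0], [1, 0, 0], [0, 0, 1]])
    for R in (U @ W @ Vt, U @ W.T @ Vt):            # the two rotation candidates
        for t in (U[:, 2], -U[:, 2]):               # the two translation signs
            P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
            Pi = np.hstack([R, t[:, None]])
            X = triangulate(P1, Pi, np.linalg.inv(K1) @ x1, np.linalg.inv(Ki) @ xi)
            Xc = X[:3] / X[3]
            if Xc[2] > 0 and (R @ Xc + t)[2] > 0:   # point in front of both cameras
                return R, t
    raise ValueError("no valid pose (degenerate correspondence)")

# --- synthetic ground truth (assumed values, for illustration only) ---
K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])
def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
R2t, t2t = rot_y(0.10), np.array([1.0, 0.0, 0.0])    # ||t2|| = 1 by construction
R3t, t3t = rot_y(-0.15), np.array([0.5, 0.3, -0.2])
Xw = np.array([0.4, -0.2, 5.0])                      # one 3D point
x1 = K @ Xw                                          # its three projections
x2 = K @ (R2t @ Xw + t2t)
x3 = K @ (R3t @ Xw + t3t)

# Step 2: F21 = [a]x.A from P2 = [A|a] in the frame where P1 = [I|0]
# (with metric cameras K[R|t], that frame gives A = K.R.inv(K) and a = K.t)
A2, a2 = K @ R2t @ np.linalg.inv(K), K @ t2t
A3, a3 = K @ R3t @ np.linalg.inv(K), K @ t3t
F21, F31 = skew(a2) @ A2, skew(a3) @ A3

# Step 3: essential matrices
E21 = K.T @ F21 @ K
E31 = K.T @ F31 @ K

# Step 4: metric poses (t2 and t3 come out with unit norm)
R2, t2 = pose_from_essential(E21, K, K, x1, x2)
R3, t3 = pose_from_essential(E31, K, K, x1, x3)

# Step 5: rescale t3 consistently with ||t2|| = 1
p2 = R2.T @ np.linalg.inv(K) @ x2
p3 = R3.T @ np.linalg.inv(K) @ x3
M = skew(p2)
s = (p3 @ M @ R2.T @ t2) / (p3 @ M @ R3.T @ t3)
t3 = s * t3

# Step 6: the metric camera matrices
P2_E = K @ np.hstack([R2, t2[:, None]])
P3_E = K @ np.hstack([R3, t3[:, None]])
```

Since the synthetic t2 has unit norm, the recovered poses should match the ground truth exactly (up to numerical precision), including the rescaled t3.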

[HZ00] "Multiple view geometry in computer vision" , by R.Hartley and A.Zisserman, 2000.

BConic
  • how do I compute H when I already know the camera calibration Matrix K? – magarwal Feb 19 '14 at 06:28
  • The PDF link was broken, I fixed it. Did you find the section I was referring to? – BConic Feb 19 '14 at 07:55
  • I did. But wasn't able to comprehend it to the level of implementation :( – magarwal Feb 19 '14 at 08:04
  • This is pretty advanced stuff. I can give you a sketch of the algorithm, which I am using fine, but chances are you will need to understand the background to implement it correctly ... – BConic Feb 19 '14 at 08:41
  • See the outline of the algorithm. – BConic Feb 19 '14 at 08:58
  • I understood the outline but I am trying a different approach using the trifocal tensor. Since the fundamental matrix gives the relation between 2 views, the trifocal tensor similarly gives the relation between 3 views... I already implemented the algorithm using F matrix -> E matrix -> projection matrices -> triangulation -> bundle adjustment -> scaling offset for metric reconstruction. It worked just fine. Now I want to try a different approach using the trifocal tensor. Please look at the main() function in line 2860: https://github.com/marutiagarwal/trifocal/blob/master/trifocal.cpp – magarwal Feb 19 '14 at 09:56
  • ^^ I already know the corresponding points in 3 images here and I have hardcoded them. Using these points only I compute 3x3x3 tensor matrix. I know camera calibration matrix also (see line 2119 in trifocal.cpp)... I don't know how to get affine reconstruction using this. Afterwards, I can get 'scale' value to obtain metric reconstruction. – magarwal Feb 19 '14 at 09:59
  • "I don't know how to get affine reconstruction using this." > Yes, that was your question, and the algorithm for that is in my answer. – BConic Feb 19 '14 at 10:07
  • More specifically, it does what you said, but consistently for the 3 camera matrices at once, instead of doing it independently for each pair of camera matrices. – BConic Feb 19 '14 at 10:14
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/47836/discussion-between-magarwal-and-aldurdisciple) – magarwal Feb 19 '14 at 10:17
  • Hello, what is the format of `x1`, `x2` and `x3`? Are they a `2x1` array (pixel coordinates)? If so, how is a matrix multiplication done with a `3x3` matrix? Or are they of the form `(u,v,1)`? Thanks – OlorinIstari Dec 21 '20 at 11:47
  • Also, what is the order of the dot product in step 5.3, left to right, or right to left? Thanks – OlorinIstari Dec 21 '20 at 17:56