Relative pose estimation using essential matrix: Wrong R and T

Question

I am trying to use the essential matrix method in OpenCV to obtain the R and t of one camera pose with respect to another. The procedure I am following is:

1. Mark features using SIFT
2. Match features using FLANN matching
3. Compute the fundamental matrix
4. Compute the essential matrix
5. Perform SVD to obtain U, W, Vt
6. Check which combination of R and t is correct according to whether or not the normalized/homogenized points are in front of the camera
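
For readers following along, the last three steps can be sketched with NumPy as below. This is a minimal illustration and not the asker's code: the function names are hypothetical, and it assumes the conventions x2^T F x1 = 0 and X2 = R X1 + t, with `x1` and `x2` given as normalized homogeneous points (pixel coordinates multiplied by the inverse camera matrix).

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """Step 4: E = K2^T F K1, assuming the convention x2^T F x1 = 0."""
    return K2.T @ F @ K1

def decompose_essential(E):
    """Step 5: return the four (R, t) candidates obtained from the SVD of E."""
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so that det(U) = det(Vt) = +1; otherwise R can be a reflection.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]  # unit vector; the scale of t cannot be recovered from E
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

def count_in_front(R, t, x1, x2):
    """Step 6: count correspondences with positive depth in both cameras."""
    count = 0
    for a, b in zip(x1, x2):
        # Solve z1 * (R a) - z2 * b = -t for the two depths in least squares.
        A = np.column_stack((R @ a, -b))
        (z1, z2), *_ = np.linalg.lstsq(A, -t, rcond=None)
        count += int(z1 > 0 and z2 > 0)
    return count

def select_pose(candidates, x1, x2):
    """Pick the candidate that puts the most points in front of both cameras."""
    return max(candidates, key=lambda c: count_in_front(c[0], c[1], x1, x2))
```

With real, noisy matches, voting over many correspondences as done here tends to be more robust than testing a single point.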

As a simple check, I tested this with an image pair consisting of the same image twice (so that neither the camera nor the image points have moved), so the translation vector should be zero and the rotation should be the identity. But the output of the program is wrong.

The fundamental matrix is
[[  3.59955121e-17  -5.77350269e-01   2.88675135e-01]
 [  5.77350269e-01   5.55111512e-17   2.88675135e-01]
 [ -2.88675135e-01  -2.88675135e-01   0.00000000e+00]]

Fundamental matrix error check: 0.000000

The essential matrix is
[[  4.51463713e-10  -7.25229650e+06  -2.37367600e+06]
 [  7.25229650e+06   6.98357978e-10   4.27847619e+06]
 [  2.37367600e+06  -4.27847619e+06  -1.33013600e-10]]

Translation matrix is
[-0.48905495 -0.2713251   0.82898007]

Rotation matrix is
[[ 0.52165052 -0.26538577  0.8108336 ]
 [-0.26538577  0.85276538  0.4498462 ]
 [ 0.8108336   0.4498462  -0.3744159 ]]
Roll: -26.965168, Pitch: 129.775110, Yaw: -54.179055
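
One quick sanity check on output like this is the determinant of the returned matrix, since a proper rotation has determinant +1. The sketch below uses plain Python with the values copied from the printout above; `det3` is a hypothetical helper, not part of the asker's code.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists (cofactor expansion)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rotation matrix exactly as printed in the question.
R_posted = [
    [0.52165052, -0.26538577, 0.8108336],
    [-0.26538577, 0.85276538, 0.4498462],
    [0.8108336, 0.4498462, -0.3744159],
]

det_R = det3(R_posted)
print(f"det(R) = {det_R:.6f}")
```

For the matrix as printed, the determinant comes out close to -1, which indicates a reflection rather than a rotation, so Euler angles extracted from it are not meaningful. A common remedy in SVD-based decompositions is to negate U or Vt whenever its determinant is negative before forming R.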

I also used this code with a pair of cameras displaced by a certain distance in X, but the Euler angles and translation I obtain using this technique (there, I consider two camera matrices instead of one) are still wrong. The translation vector tells me that I've moved in both X and Z, and the rotation matrix is not accurate. I am confused as to what might be going wrong here. Any suggestions would be very helpful. Thank you!

EDIT: My code can be viewed here

try to find out WHICH STEP of your computation is erroneous. My suggestion for testing: replace `1. Mark features using SIFT; 2. Match features using FLANN matching` by some ground truth correspondences. — Micka, Jul 29 '15 at 08:40
Hi Micka, as I was using the same picture twice, the correspondences should be exactly the same in both the lists. Shouldn't that solve the correspondences issue, or are you suggesting using like a chessboard or something of that sort? — HighVoltage, Jul 29 '15 at 18:20
not sure whether those algorithms work if there is no movement at all. — Micka, Jul 29 '15 at 18:22
@HighVoltage, as you are using the same image, you do not really have a stereo pair. The "two" camera centers are located at the same place and the epipolar geometry does not hold. Thus, no matter what you get from your calculation, rest assured it is wrong. This can be easily seen from F = K2^(-T) R K^(T)[e]x and e = K2 t: since t = [0, 0, 0], e = [0, 0, 0], and so F cannot be computed. — Tal J. Levy, Jun 19 '17 at 08:52
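
To illustrate the degeneracy described in the comments: when both point lists are identical, the epipolar constraint reduces to x^T F x = 0, which every skew-symmetric matrix satisfies for every x, so the estimate is not unique. The fundamental matrix printed in the question is skew-symmetric up to rounding. The sketch below checks this in plain Python; the test points are arbitrary made-up pixel coordinates and `epipolar_residual` is a hypothetical helper.

```python
# Fundamental matrix exactly as printed in the question.
F_posted = [
    [3.59955121e-17, -5.77350269e-01, 2.88675135e-01],
    [5.77350269e-01, 5.55111512e-17, 2.88675135e-01],
    [-2.88675135e-01, -2.88675135e-01, 0.0],
]

def epipolar_residual(F, x2, x1):
    """Return x2^T F x1 for homogeneous 3-vectors x1 and x2."""
    return sum(x2[r] * F[r][c] * x1[c] for r in range(3) for c in range(3))

# Arbitrary homogeneous pixel coordinates, each matched with itself.
points = [(10.0, 20.0, 1.0), (320.0, 240.0, 1.0), (-5.0, 77.0, 1.0)]
residuals = [epipolar_residual(F_posted, p, p) for p in points]
print(residuals)  # all values are numerically zero
```

This is consistent with the "error check: 0.000000" line in the question: a zero residual on identical points says nothing about whether the recovered geometry is meaningful.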

score 0 · Answer 1 · answered Feb 07 '17 at 02:57

I think that before proceeding with feature matching, you need to undistort the images using the camera matrices and distortion coefficients. I know this is too late, but I hope this helps others.

troymyname00