
I am trying to use the PnP algorithm implementations in OpenCV (EPNP, ITERATIVE, etc.) to get metric pose estimates of the cameras in a two-camera pair (not a conventional stereo rig; the cameras are free to move independently of each other). My current source of images is a robot simulator (Gazebo), where two cameras are simulated in a scene of objects. The images are almost ideal: zero distortion, no artifacts.

So to start off, this is my first pair of images.

[left camera image] [right camera image]

I take the right camera as the origin. In metric world coordinates, the left camera is at (1,1,1) and the right is at (-1,1,1) (2 m baseline along X). Using feature matching, I construct the essential matrix and from it the R and t of the left camera w.r.t. the right. This is what I get:

R in euler angles: [-0.00462468, -0.0277675, 0.0017928]
t matrix: [-0.999999598978524; -0.0002907901840156801; -0.0008470441900959029]

Which is right, because the displacement is only along the X axis in the camera frame. For the second pair, the left camera is now at (1,1,2) (moved upwards by 1m).

[left camera image] [right camera image]

Now the R and t of left w.r.t. right become:

R in euler angles: [0.0311084, -0.00627169, 0.00125991]
t matrix: [-0.894611301085138; -0.4468450866008623; -0.0002975759140359637]

Which again makes sense: there is essentially no rotation, and the component along the camera Y axis (0.447) is half of the component along X (0.894), which is exactly what moving up 1 m on a 2 m baseline should give. Of course, the t from the essential matrix is only defined up to scale (it has unit norm), so it doesn't give me the real metric estimates.
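For reference, the relative-pose step is roughly the following (a minimal sketch, not my exact code; `pts_left`/`pts_right` are placeholder names for the matched pixel coordinates and `K` is the intrinsic matrix from the simulator):

```python
import cv2
import numpy as np

# Matched pixel coordinates from the two views, as Nx2 float arrays.
# pts_left, pts_right and the intrinsics K come from the feature matcher / simulator.
E, inliers = cv2.findEssentialMat(pts_left, pts_right, K,
                                  method=cv2.RANSAC, prob=0.999, threshold=1.0)

# Recover the relative rotation and translation; t comes back as a unit vector,
# so the scale (the 2 m baseline) is not recoverable from this step alone.
_, R, t, mask = cv2.recoverPose(E, pts_left, pts_right, K, mask=inliers)
```

Whether this gives left-w.r.t.-right or the inverse depends on the argument order and convention, but the direction and unit norm of t are what matter for the numbers above.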

So in order to get metric estimates of pose in case 2, I constructed 3D points by triangulating the matched points from the left and right cameras in case 1 (taking the known 2 m baseline into account), and then ran the PnP algorithm with those 3D points and the image points from case 2. Strangely, both the ITERATIVE and EPNP algorithms give me a similar and completely wrong result that looks like this:

Pose according to final PNP calculation is: 
Rotation euler angles: [-9.68578, 15.922, -2.9001]
Metric translation in m: [-1.944911461358863; 0.11026997013253; 0.6083336931263812]
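Roughly, the triangulation-plus-PnP step looks like this (again only a sketch under my assumptions; `pts_left_1`/`pts_right_1` are the matched case-1 points, `pts_left_2` are the case-2 left-camera points corresponding to the same physical features, and `R`, `t` are the case-1 results from `recoverPose`):

```python
import cv2
import numpy as np

# Case 1: scale the unit-norm translation by the known 2 m baseline
# and triangulate metric 3D points in the right-camera frame.
t_metric = 2.0 * t
P_right = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # right camera = origin
P_left  = K @ np.hstack([R, t_metric])                   # left camera, case 1

pts4d = cv2.triangulatePoints(P_right, P_left, pts_right_1.T, pts_left_1.T)
pts3d = (pts4d[:3] / pts4d[3]).T                          # Nx3 metric points

# Case 2: solve PnP with those 3D points and the new left-camera image points.
ok, rvec, tvec = cv2.solvePnP(pts3d, pts_left_2, K, None,
                              flags=cv2.SOLVEPNP_EPNP)   # or cv2.SOLVEPNP_ITERATIVE
R2, _ = cv2.Rodrigues(rvec)
cam_pos = -R2.T @ tvec  # camera position expressed in the right-camera frame
```

The sign/frame conventions (which camera the [R|t] maps into) are my assumption here; the point is that the triangulated points are metric, so the PnP translation should come out metric as well.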

Am I missing something basic here? I thought this should be a relatively straightforward calculation for PnP, given that there is no distortion, etc. Any comments or suggestions would be very helpful, thanks!

  • Your comment "(1,1,2) (moved upwards by 1m)" strikes me as odd. OpenCV has Z forward and Y down. Maybe your robot simulator uses the GL coordinate system? – Joan Charmant Nov 17 '15 at 11:16
  • True, but OpenCV does not use those world values, I was just mentioning those for clarity. The projection matrices are constructed using the R and t obtained from decomposing the essential matrix. – HighVoltage Nov 17 '15 at 14:11
