
I am trying to use the PnP algorithm implementations in OpenCV (EPNP, ITERATIVE, etc.) to get metric pose estimates of the cameras in a two-camera pair (not a conventional stereo rig; the cameras are free to move independently of each other). My current source of images is a robot simulator (Gazebo), where two cameras are simulated in a scene of objects. The images are almost ideal: zero distortion, no artifacts.

So to start off, this is my first pair of images.

[left camera image] [right camera image]

I take the right camera as the origin. In metric world coordinates, the left camera is at (1,1,1) and the right is at (-1,1,1) (2 m baseline along X). Using feature matching, I construct the essential matrix and from it the R and t of the left camera w.r.t. the right. This is what I get:

R in euler angles: [-0.00462468, -0.0277675, 0.0017928]
t matrix: [-0.999999598978524; -0.0002907901840156801; -0.0008470441900959029]

Which is right, because the displacement is only along the X axis in the camera frame. For the second pair, the left camera is now at (1,1,2) (moved upwards by 1m).

[left camera image] [right camera image]

Now the R and t of left w.r.t. right become:

R in euler angles: [0.0311084, -0.00627169, 0.00125991]
t matrix: [-0.894611301085138; -0.4468450866008623; -0.0002975759140359637]

Which again makes sense: there is essentially no rotation, and the component along the camera Y axis (0.447) is half of the component along X (0.894), which is exactly what moving up 1 m on a 2 m baseline should give. Of course, the t from the essential matrix is only defined up to scale (it has unit norm), so it doesn't give me the real metric estimates.
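For reference, the relative-pose step is roughly the following (a minimal sketch, not my exact code; `pts_left`/`pts_right` are placeholder names for the matched pixel coordinates and `K` is the intrinsic matrix from the simulator):

```python
import cv2
import numpy as np

# Matched pixel coordinates from the two views, as Nx2 float arrays.
# pts_left, pts_right and the intrinsics K come from the feature matcher / simulator.
E, inliers = cv2.findEssentialMat(pts_left, pts_right, K,
                                  method=cv2.RANSAC, prob=0.999, threshold=1.0)

# Recover the relative rotation and translation; t comes back as a unit vector,
# so the scale (the 2 m baseline) is not recoverable from this step alone.
_, R, t, mask = cv2.recoverPose(E, pts_left, pts_right, K, mask=inliers)
```

Whether this gives left-w.r.t.-right or the inverse depends on the argument order and convention, but the direction and unit norm of t are what matter for the numbers above.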

So in order to get metric estimates of pose in case 2, I constructed 3D points by triangulating the matched points from the left and right cameras in case 1 (taking the known 2 m baseline into account), and then ran the PnP algorithm with those 3D points and the image points from case 2. Strangely, both the ITERATIVE and EPNP algorithms give me a similar and completely wrong result that looks like this:

Pose according to final PNP calculation is: 
Rotation euler angles: [-9.68578, 15.922, -2.9001]
Metric translation in m: [-1.944911461358863; 0.11026997013253; 0.6083336931263812]
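Roughly, the triangulation-plus-PnP step looks like this (again only a sketch under my assumptions; `pts_left_1`/`pts_right_1` are the matched case-1 points, `pts_left_2` are the case-2 left-camera points corresponding to the same physical features, and `R`, `t` are the case-1 results from `recoverPose`):

```python
import cv2
import numpy as np

# Case 1: scale the unit-norm translation by the known 2 m baseline
# and triangulate metric 3D points in the right-camera frame.
t_metric = 2.0 * t
P_right = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # right camera = origin
P_left  = K @ np.hstack([R, t_metric])                   # left camera, case 1

pts4d = cv2.triangulatePoints(P_right, P_left, pts_right_1.T, pts_left_1.T)
pts3d = (pts4d[:3] / pts4d[3]).T                          # Nx3 metric points

# Case 2: solve PnP with those 3D points and the new left-camera image points.
ok, rvec, tvec = cv2.solvePnP(pts3d, pts_left_2, K, None,
                              flags=cv2.SOLVEPNP_EPNP)   # or cv2.SOLVEPNP_ITERATIVE
R2, _ = cv2.Rodrigues(rvec)
cam_pos = -R2.T @ tvec  # camera position expressed in the right-camera frame
```

The sign/frame conventions (which camera the [R|t] maps into) are my assumption here; the point is that the triangulated points are metric, so the PnP translation should come out metric as well.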

Am I missing something basic here? I thought this should be a relatively straightforward calculation for PnP, given that there is no distortion, etc. Any comments or suggestions would be very helpful, thanks!

  • Your comment "(1,1,2) (moved upwards by 1m)" strikes me as odd. OpenCV has Z forward and Y down. Maybe your robot simulator uses the GL coordinate system? – Joan Charmant Nov 17 '15 at 11:16
  • True, but OpenCV does not use those world values, I was just mentioning those for clarity. The projection matrices are constructed using the R and t obtained from decomposing the essential matrix. – HighVoltage Nov 17 '15 at 14:11
