I am using OpenCV for visual odometry. I have a video of a road taken from a monocular camera mounted on a moving car. I would like to obtain the translation vectors between frames.

What I have done so far:

  • I obtained matches of keypoints between one frame and the next one.
  • I then used `recoverPose` and obtained the rotation matrices, the translation vectors (up to scale), and the 3D coordinates of some triangulated points.

My issue is that with just two frames I cannot recover the real translation vector, only its direction. However, if I find the same point in three different frames and triangulate its 3D coordinates with respect to both the first and the second reference frame, I think I can retrieve the real translation vector between the camera at t0 and t1 as the difference between the coordinates found in the two frames.

Is the above statement correct?

Of course, it would be better to use multiple points and some kind of voting method. I just want to know whether the method is feasible or whether I am missing some fundamental problem.

Milan Š.
tomtom

1 Answer

That is incorrect. From a set of images alone, you can recover the translation vector only up to an unknown scale factor, regardless of the technique used. You need another source of real-world information to recover the correct scale and obtain the real translation vector. See my answer to a similar question.
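
To illustrate why adding a third frame does not help, here is a minimal NumPy sketch (not OpenCV code). It assumes a made-up pinhole camera `K`, three hypothetical camera poses, and random scene points; the helper names `project`, `rot_y`, and `max_pixel_difference` are invented for this example. It scales the whole scene and every translation by the same factor and compares the resulting pixel coordinates in all three frames.

```python
import numpy as np

def project(K, R, t, X):
    """Project Nx3 world points X into a camera with pose (R, t); returns Nx2 pixels."""
    x_cam = X @ R.T + t              # world -> camera coordinates: R @ X + t
    x_img = x_cam @ K.T              # apply the intrinsic matrix
    return x_img[:, :2] / x_img[:, 2:3]

def rot_y(angle):
    """Rotation about the y axis (a small yaw, as for a turning car)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Assumed intrinsics and scene: values are arbitrary, chosen only for illustration.
rng = np.random.default_rng(0)
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
X = rng.uniform([-5, -2, 10], [5, 2, 30], size=(50, 3))

# Three hypothetical camera poses (frames t0, t1, t2).
poses = [
    (np.eye(3), np.zeros(3)),
    (rot_y(0.02), np.array([0.1, 0.0, -1.0])),
    (rot_y(0.05), np.array([0.3, 0.0, -2.1])),
]

def max_pixel_difference(scale):
    """Largest pixel change over all frames when scene and translations are scaled together."""
    diffs = []
    for R, t in poses:
        original = project(K, R, t, X)
        scaled = project(K, R, scale * t, scale * X)
        diffs.append(np.abs(original - scaled).max())
    return max(diffs)

print(max_pixel_difference(3.7))
```

Because `R @ (s * X) + s * t` equals `s * (R @ X + t)`, the factor `s` cancels in the perspective division, so the printed difference should be at floating-point noise level for any positive scale and any number of frames. The images alone therefore cannot tell a small scene with short translations from a large scene with long ones.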

Piotr Siekański
  • Thank you for the answer @Piotr, but I still struggle to see why. I assume that the triangulated 3D points obtained from `recoverPose` are expressed with respect to the camera coordinates of the first frame used. With just two images, it is clear to me that the scale cannot be retrieved (an infinite number of second camera positions allow the same triangulation), but I thought that a third image would clear this up. If it doesn't, I struggle to find meaning in the coordinates of the 3D points given by `recoverPose`, since for a given point I have two sets of coordinates with respect to two different reference frames. – tomtom Jan 05 '23 at 16:13