I am using OpenCV for visual odometry. I have a video of a road taken from a monocular camera mounted on a moving car. I would like to obtain the translation vectors between frames.
What I have done so far:
- I obtained matches of keypoints between one frame and the next one.
- I then used recoverPose and got the rotation matrices, the translation vectors (up to scale), and some three-dimensional point coordinates.
My issue is that with just two frames I cannot recover the real translation vector (just the direction). But if I find the same point in three different frames and triangulate its 3D coordinates with respect to the first and second reference frame, I think I can retrieve the real translation vector between the camera at t0 and t1 as the difference of the coordinates which I find in the two frames.
Is the above statement correct?
Of course it would be better to have multiple points and some kind of voting method. I just want to know whether the method is feasible or whether I am missing some fundamental problem.
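To make the two-frame scale ambiguity mentioned above concrete, here is a small NumPy sketch. All values (intrinsics, pose, points, and the scale factor) are made up for illustration, and project is a hypothetical helper. It checks that scaling the scene and the translation by the same factor leaves the pixel coordinates in both frames unchanged.

```python
import numpy as np

def project(K, R, t, X):
    """Project N x 3 points X (first-camera coordinates) into a camera with pose (R, t)."""
    x_cam = X @ R.T + t                      # move points into the camera frame
    x_img = x_cam @ K.T                      # apply the intrinsics
    return x_img[:, :2] / x_img[:, 2:3]      # perspective division

# Made-up intrinsics and relative pose between two frames.
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = 0.05                                 # small rotation about the y axis (radians)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([0.2, 0.0, -1.0])

# Made-up scene points in front of both cameras.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-2, 2, 50),
                     rng.uniform(-1, 1, 50),
                     rng.uniform(4, 10, 50)])

scale = 3.7                                  # arbitrary positive factor
identity, zero = np.eye(3), np.zeros(3)

# Original scene versus a scene and translation scaled by the same factor.
same_frame0 = np.allclose(project(K, identity, zero, X),
                          project(K, identity, zero, scale * X))
same_frame1 = np.allclose(project(K, R, t, X),
                          project(K, R, scale * t, scale * X))
same = same_frame0 and same_frame1
print(same)
```

The images alone cannot distinguish the two configurations, which is why only the direction of the translation comes out of the two-frame estimate.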