I am trying to recover the trajectory of a camera using a sequence of 2D images and OpenCV. But the trajectory I get is not as good as I would like it to be: it goes back and forth instead of just going forward.
I have a sequence of photos taken by the camera while it was moving (the outdoor part of the KITTI dataset, specifically). For each pair of consecutive frames I compute the rotation matrix (R) and translation vector (t) with E = cv2.findEssentialMat() and cv2.recoverPose(E, ...), and then I estimate the trajectory, assuming that the coordinates of every translation vector are given in a local coordinate system whose orientation is set by the corresponding rotation matrix.
upd: Each recovered position looks like [X, Y, Z], and I scatter (X_i, Y_i) for every i (these points are treated as 2D positions), so the following graphs are my estimated trajectories.
Here's what I get instead of a straight line (the camera was moving straight forward). Previous results were even worse.
The green point is where it starts and the red point is where it ends. So most of the time it even moves backwards. This, though, is probably caused by a mistake at the very beginning, which turned everything around (right?)
Here's what I do:
# estimate the essential matrix from the matched points (RANSAC, prob=0.99999, threshold=0.1)
E, mask = cv2.findEssentialMat(points1, points2, K_00, cv2.RANSAC, 0.99999, 0.1)
# recover the relative pose, reusing the RANSAC inlier mask;
# 'inliers' is the number of points passing the cheirality check
inliers, R, t, mask = cv2.recoverPose(E, points1, points2, K_00, mask=mask)
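The outlier check I mention in point 1 below looks roughly like this (a simplified sketch with a hypothetical helper; the 0.5 threshold is an arbitrary placeholder):

    import numpy as np

    def enough_inliers(mask, n_pose_inliers, n_points, min_ratio=0.5):
        # hypothetical helper: decide whether a frame pair is reliable enough
        # mask: inlier mask from findEssentialMat / recoverPose (nonzero = inlier)
        # n_pose_inliers: first return value of recoverPose (points passing the cheirality check)
        ransac_ratio = np.count_nonzero(mask) / n_points
        pose_ratio = n_pose_inliers / n_points
        return ransac_ratio >= min_ratio and pose_ratio >= min_ratio

    # usage: if not enough_inliers(mask, inliers, len(points1)): skip this frame pair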
It seems to me that recoverPose somehow chooses the wrong sign of R and t on some steps, so the trajectory that was supposed to go forward goes back, and then forward again.
What I did to improve the situation:
1) skip the frames with too many outliers (I check this both after findEssentialMat and after recoverPose; the check is sketched above);
2) set the RANSAC threshold in findEssentialMat to 0.1;
3) increase the number of feature points on each image from 8 to 24.
This didn't really help.
Here I should note: I know that in practice the 5-point algorithm, which is used to compute the essential matrix, needs many more points than 8 or even 24, and maybe this is actually the problem.
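One thing I might try (sketched below, this is not what my current code does) is to replace the box corners with a few hundred KLT-tracked corners per frame pair; prev_img and cur_img stand for the two consecutive grayscale frames:

    import cv2

    # detect up to 500 corners in the first frame
    p1 = cv2.goodFeaturesToTrack(prev_img, maxCorners=500,
                                 qualityLevel=0.01, minDistance=10)

    # track them into the second frame with pyramidal Lucas-Kanade
    p2, status, err = cv2.calcOpticalFlowPyrLK(prev_img, cur_img, p1, None)

    # keep only the successfully tracked points
    ok = status.ravel() == 1
    points1 = p1.reshape(-1, 2)[ok]
    points2 = p2.reshape(-1, 2)[ok]
    # points1 / points2 then go into cv2.findEssentialMat as above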
So the questions are:
1) Can the small number of feature points (approx. 8-24) be the cause of the recoverPose mistakes?
2) If checking the number of outliers is the right thing to do, what percentage of outliers should I set as the limit?
3) I estimate positions like this (instead of simply p[i+1] = R*p[i] + t):
C = np.dot(R, C)
p[i+1] = p[i] + np.dot(np.linalg.inv(C), t)
This is because I can't help thinking of t as a vector in local coordinates, so C is the transformation matrix, which is updated on every step to accumulate the rotations. Is that right or not really? (A sketch of the alternative I am comparing against is below the questions.)
4) It's quite possible that I am missing something, since my knowledge of the topic is tiny. Is there anything (anything!) you could recommend?
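For question 3, here is the accumulation I would expect if I read the OpenCV docs correctly, namely that R and t from recoverPose map a point from the previous camera's coordinates to the current one's (x_cur = R @ x_prev + t). I may be wrong about that convention, so this is just a sketch for comparison; relative_poses is a hypothetical list of the per-pair results:

    import numpy as np

    def to_homogeneous(R, t):
        # 4x4 rigid transform from a 3x3 rotation and a 3x1 translation
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t.ravel()
        return T

    pose = np.eye(4)                      # current camera pose in world coordinates
    trajectory = [pose[:3, 3].copy()]
    for R, t in relative_poses:           # hypothetical list of (R, t) from recoverPose
        # chain the inverse of the relative transform onto the world pose;
        # the monocular scale is unknown and taken as 1 here
        pose = np.dot(pose, np.linalg.inv(to_homogeneous(R, t)))
        trajectory.append(pose[:3, 3].copy())

One thing I would double-check in my C-based version is the order of the product: np.dot(R, C) accumulates the rotations in the opposite order from chaining the transforms as above, and the two are not equivalent in general.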
Huge thanks for your time! I would appreciate any advice.
upd: For example, here are the first six rotation matrices, translation vectors, and recovered positions I get. The signs of t look a bit erratic.
upd: Here's my code (I'm not a very good programmer yet). The main idea is that my feature points are the corners of bounding boxes of static objects, which I detect with Faster R-CNN (I used this implementation). So the first part of the code detects objects, and the second part uses the detected feature points to recover the trajectory.
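A variation I am considering for the feature part (not in the code yet, just a sketch): keep the detections, but use the boxes only as a mask, so that goodFeaturesToTrack can pick many corners inside the static objects instead of just the four box corners. Here boxes stands for a list of (x1, y1, x2, y2) detections and prev_img for the first grayscale frame of a pair:

    import cv2
    import numpy as np

    # build a binary mask covering the detected static objects
    mask = np.zeros(prev_img.shape[:2], dtype=np.uint8)
    for (x1, y1, x2, y2) in boxes:
        mask[int(y1):int(y2), int(x1):int(x2)] = 255

    # detect corners only inside the boxes, then track them with KLT as above
    p1 = cv2.goodFeaturesToTrack(prev_img, maxCorners=500, qualityLevel=0.01,
                                 minDistance=10, mask=mask)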
Here's the dataset I use (this is part 2011_09_26_drive_0005 from here).