I'm trying to implement object detection with depth measurement using a monocular camera mounted near the center of rotation of a door. The basic idea is to detect objects in front of the door (while it swings up) and raise a collision warning if the door cannot open completely.
There are some problems I have to deal with:
- Little movement: the camera rotates from 0° to 90°, and because it sits close to the rotation axis there is only very limited translation.
- The camera lens is a fisheye (AOV ~100° vertical and ~180° horizontal), so the images have to be undistorted before any of the steps below; a sketch of the undistortion I have in mind follows this list.
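For reference, this is roughly how I undistort the fisheye frames (a minimal sketch using OpenCV's fisheye module; the intrinsics `K`, distortion coefficients `D`, and the file name are placeholders, not my actual calibration):

```python
import cv2
import numpy as np

# Placeholder intrinsics/distortion -- in my setup these would come
# from a cv2.fisheye.calibrate() run with a checkerboard.
K = np.array([[285.0, 0.0, 640.0],
              [0.0, 285.0, 360.0],
              [0.0,   0.0,   1.0]])
D = np.array([0.05, -0.01, 0.001, 0.0])  # fisheye coefficients k1..k4

img = cv2.imread("frame_000.png")
h, w = img.shape[:2]

# balance trades off kept FOV against stretching at the borders
# (a pinhole view can never cover the full 180 deg anyway).
new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
    K, D, (w, h), np.eye(3), balance=0.8)
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), new_K, (w, h), cv2.CV_16SC2)
undistorted = cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR)
```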
What I tried:
Structure from motion:
Even with a full sequence (rotation from 0° to 90°) I only get a really bad scene reconstruction. Given how little the camera actually translates, I believe it is very hard to get good results with this approach, especially because I cannot use a full video sequence when it comes to a live collision warning. The two-view core of what I'm doing looks roughly like the sketch below.
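A minimal two-view sketch of the pose estimation and triangulation step (OpenCV; `K` and the frame file names are placeholders):

```python
import cv2
import numpy as np

# Pinhole intrinsics of the already-undistorted images -- placeholders.
K = np.array([[600.0, 0.0, 640.0],
              [0.0, 600.0, 360.0],
              [0.0,   0.0,   1.0]])

img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_010.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(4000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# Triangulate; with near-pure rotation the viewing rays are almost
# parallel, so the recovered depths are extremely unstable.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
depths = pts4d[2] / pts4d[3]
```

Note that the recovered `t` is only defined up to scale, and with this geometry even its direction already seems very noisy.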
Faking stereo reconstruction:
I took two consecutive images, rectified them, and tried to generate a disparity map. Almost every time the epipole lies inside the image, which leads to heavy distortion when rectifying. But even when the epipole lies outside the images (so rectification is possible), the disparity map is unusable (fragments, no distinctive structures). Roughly, the pipeline looks like the sketch below.
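The rectification/disparity part, continuing from the two-view sketch above (`K`, `R`, `t`, `img1`, `img2` as before; the SGBM parameters are just values I experimented with, not tuned ones):

```python
import cv2
import numpy as np

# K, R, t, img1, img2 continue from the two-view sketch above.
h, w = img1.shape[:2]

# Rectify as if the two frames were a calibrated stereo pair
# (distortion is None because the frames are already undistorted).
R1, R2, P1r, P2r, Q, roi1, roi2 = cv2.stereoRectify(
    K, None, K, None, (w, h), R, t,
    flags=cv2.CALIB_ZERO_DISPARITY, alpha=0)
map1x, map1y = cv2.initUndistortRectifyMap(K, None, R1, P1r, (w, h), cv2.CV_32FC1)
map2x, map2y = cv2.initUndistortRectifyMap(K, None, R2, P2r, (w, h), cv2.CV_32FC1)
rect1 = cv2.remap(img1, map1x, map1y, cv2.INTER_LINEAR)
rect2 = cv2.remap(img2, map2x, map2y, cv2.INTER_LINEAR)

# Semi-global matching on the rectified pair.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5,
                             P1=8 * 5 ** 2, P2=32 * 5 ** 2)
disparity = sgbm.compute(rect1, rect2).astype(np.float32) / 16.0
```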
I would appreciate any ideas on what else I could try here. Would optical flow be suitable? My (untested) idea would be something like the sketch below.
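What I have in mind for the optical-flow route (purely hypothetical; Farneback dense flow, with the rotation-compensation step only indicated as a comment):

```python
import cv2
import numpy as np

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=21,
                                    iterations=3, poly_n=5, poly_sigma=1.2,
                                    flags=0)
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])

# Idea: the door angle (and hence the purely rotational flow field) is
# known from the door mechanics, so one could subtract the predicted
# rotational flow; the residual parallax should grow for close objects.
```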
I'm also interested in some sort of (mathematical) proof of when a solution is simply not possible. Is there a minimum amount of movement needed to get good reconstruction results with algorithms like SfM? Is translation better than rotation for SfM? My intuition so far is sketched below, but I'd like a more rigorous treatment.
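For reference, my current understanding of why the baseline matters (standard two-view geometry, nothing specific to my setup): for a rectified pair with baseline $B$, focal length $f$, and disparity $d$,

$$Z = \frac{fB}{d}, \qquad \Delta Z \approx \left|\frac{\partial Z}{\partial d}\right| \Delta d = \frac{Z^2}{fB}\,\Delta d,$$

so a fixed disparity error $\Delta d$ produces a depth error that grows quadratically with depth and diverges as $B \to 0$. A pure rotation about the camera's optical center produces no parallax at all, so depth is unobservable in that case; only the small translation caused by the offset from the rotation axis carries any depth information. What I'm missing is how to turn this into a concrete minimum-baseline requirement for my geometry.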