
I'm trying to implement object detection with depth measurement using a mono camera mounted near the center of rotation of a door. The basic idea is to detect objects in front of the door (as it swings up) and issue a collision warning if the door cannot open completely.

There are some problems I have to deal with:

  • Little movement: The camera rotates from 0° to 90°, and there is only limited translation (due to the camera position)
  • The camera lens is a fisheye (AOV ~100° vertical and 180° horizontal)

What I tried:

  • Structure from motion:

    Even with a full sequence (rotation from 0° to 90°) I only get a very poor scene reconstruction. Because the camera moves so little, I believe it is really hard to get good results with this approach (especially because I cannot use a full video sequence when it comes to collision warning).

  • Faking stereo reconstruction:

    I used two consecutive images, rectified them and tried to generate a disparity map. Almost every time the epipole lies inside the images, which leads to a high amount of distortion when rectifying. But even when the epipole lies outside the images (so rectification is possible), the disparity map is unusable (fragmented, with no distinctive structures).

I would appreciate any ideas on what I could try here. Might optical flow be suitable?

I'm also interested in some sort of (mathematical) proof of when a solution is not possible. Is there a minimum amount of movement needed to get good reconstruction results with algorithms like SfM? Is translation better than rotation for SfM?

Matthias Preu
  • Hum, if anything is close enough to a door to be hit when the door opens, it is close enough to be detected using a sonar-based proximity sensor (or, if it moves, any kind of range-limited motion sensor). Why do you want to use a camera in such an odd configuration? Is it because it's already there for some other reason? – Francesco Callari Oct 14 '15 at 18:13

1 Answer


I believe you're already on to why this is very hard. Rotation-only reconstruction is not possible without additional scene-dependent hints or cues.
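To illustrate why, here is a minimal NumPy sketch, assuming an ideal pinhole camera with made-up intrinsics `K` (not your fisheye calibration) and a hypothetical 10° rotation. It places two points at different depths on the same viewing ray and compares where they land in the second view. With zero translation they still project to the same pixel, so the image pair carries no depth information; with a small translation they separate. The same reasoning should carry over to a fisheye model, since any central camera maps a pixel from the ray direction alone.

```python
import numpy as np

def rotation_y(angle_rad):
    """Rotation matrix about the camera's y axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project(K, R, t, X):
    """Pinhole projection of X (first-camera frame) into a camera with pose R, t."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

# Hypothetical intrinsics and motion, chosen only for illustration.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rotation_y(np.deg2rad(10.0))

# Two points on the same viewing ray of the first camera, at depths 1 and 5.
ray = np.array([0.1, -0.05, 1.0])
near, far = 1.0 * ray, 5.0 * ray

def parallax(t):
    """Pixel distance between the two points in the second view."""
    return np.linalg.norm(project(K, R, t, near) - project(K, R, t, far))

parallax_rotation_only = parallax(np.zeros(3))
parallax_with_translation = parallax(np.array([0.2, 0.0, 0.0]))

print("rotation only:   ", parallax_rotation_only)
print("with translation:", parallax_with_translation)
```

The first value is zero up to floating-point error regardless of the rotation angle, while the second grows with the baseline.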

You may want to take a look at the following paper, which uses indoor environment assumptions to do 3D reconstruction from a single image:

E. Delage, H. Lee, and A. Y. Ng. A dynamic Bayesian network model
for autonomous 3D reconstruction from a single indoor image. In Computer
Vision and Pattern Recognition, 2006 IEEE Computer Society Conference
on, volume 2, pages 2418-2428, 2006.

Probably the best solution would be to change the mounting of the camera so that rotating about the hinge also translates the camera.
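As a rough guide to how much translation a different mounting would give, here is a small sketch. It assumes the camera centre sits at a fixed distance from the hinge axis and moves on a circle around it, so the baseline between two door angles is the chord length 2·r·sin(θ/2). The function name `hinge_baseline` and the example distances are hypothetical, not measurements from your setup.

```python
import math

def hinge_baseline(radius_m, angle_deg):
    """Straight-line distance (m) between the camera centres before and after
    rotating by angle_deg about a hinge axis that is radius_m away."""
    return 2.0 * radius_m * math.sin(math.radians(angle_deg) / 2.0)

# Hypothetical mountings: 2 cm from the hinge axis vs. 50 cm out on the door.
for radius in (0.02, 0.5):
    for angle in (5.0, 90.0):
        baseline = hinge_baseline(radius, angle)
        print(f"r = {radius:.2f} m, rotation = {angle:4.1f} deg, baseline = {baseline:.4f} m")
```

The baseline scales linearly with the distance from the hinge, so moving the camera outward is the direct way to get more parallax from the same door motion.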

Photon
  • Thanks for your reply. I had already heard that rotation-only reconstruction is not possible, but I couldn't find a good resource where this is explicitly explained or proven. Maybe you have more information about that? Thanks for the paper suggestion. – Matthias Preu Oct 13 '15 at 15:21
  • Section 9.2.2 in HZ (Hartley, Zisserman - Multiple View Geometry in Computer Vision) shows that the fundamental matrix relies on a nonzero translation. Section 12.1 states that a fundamental matrix is necessary for reconstruction. – Photon Oct 14 '15 at 06:26