Say I have an object, and I obtained RGB data and depth data from a Kinect at one angle. Then I moved the Kinect slightly so that I can take a second picture of the object from a different angle. I'm trying to figure out the best way to determine how much the Kinect camera has translated and rotated from its original position, based on the correspondences between the two images.
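To make the setup concrete, here is roughly how I'm thinking of turning a matched pixel plus its depth value into a 3D point in the camera frame. This is just a sketch assuming a simple pinhole model; the intrinsics `fx, fy, cx, cy` and the depth-in-metres convention are placeholders for whatever calibration the Kinect actually uses:

```python
import numpy as np

def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with depth in metres to a 3D point
    in the camera frame, assuming pinhole intrinsics fx, fy, cx, cy."""
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])
```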
I'm using OpenCV for certain image processing tasks, and I've looked into the SURF algorithm, which seems to find good correspondences between two 2D images. However, it doesn't take the depth data into account, and I don't think it works very well across multiple pictures. Additionally, I can't figure out how to obtain the translation/rotation data from the correspondences. Perhaps I'm looking in the wrong direction?
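For reference, this is roughly the kind of computation I imagine is needed to recover the relative pose once I have matched 3D points from both views (an SVD-based Kabsch/Procrustes fit; the function name is mine, and it assumes the matches have already had outliers removed, which may not hold for raw SURF matches):

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Estimate R, t such that dst ≈ R @ src + t.

    src, dst: (N, 3) arrays of matched 3D points in the same order,
    e.g. back-projected depth at matched SURF keypoint locations.
    """
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    src_c = src - src_centroid
    dst_c = dst - dst_centroid

    # Cross-covariance of the centred point sets
    H = src_c.T @ dst_c

    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    # Guard against a reflection (det(R) == -1)
    if np.linalg.det(R) < 0:
        Vt[-1, :] *= -1
        R = Vt.T @ U.T

    t = dst_centroid - R @ src_centroid
    return R, t
```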
Note: My long-term goal is to "merge" the multiple images to form a 3D model from all the data. At the moment it somewhat works if I specify the angles the Kinect is located at, but I think it would be much better to reduce the "errors" involved (i.e., points shifted slightly from where they should be) by finding correspondences instead of specifying the parameters manually.
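If I can recover that pose reliably, I assume the merging step would just be transforming the second cloud into the first camera's frame, something like the following (assuming `R`, `t` map points from the second view into the first view's frame, and `cloud1`, `cloud2` are (N, 3) arrays of back-projected points):

```python
# Bring the second cloud into the first camera's frame, then concatenate.
cloud2_in_frame1 = (R @ cloud2.T).T + t
merged = np.vstack([cloud1, cloud2_in_frame1])
```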