I have a calibrated (virtual) camera in Blender that views a roughly planar object. I render an image from a first camera pose P0, then move the camera to a new pose P1. I have the 4x4 camera matrix for both views, from which I can calculate the transformation M10 between the cameras as given below. I also know the intrinsics matrix K. Using these, I want to map points from the image taken at P0 to the image seen from P1 (I have ground truth to compare against, because I can render in Blender after moving the camera to P1).

If I only rotate the camera between P0 and P1, I can calculate the homography perfectly. But if there is translation, the calculated homography matrix does not take it into account. The theory says that, after calculating M10, the last row and column should be dropped for a planar scene. However, when I check M10, the translation values are in the rightmost column, which is exactly the column I drop to get the 3x3 homography matrix H10. As a result, if there is no rotation, H10 is equal to the identity matrix. What is going wrong here?
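To make the symptom concrete, here is a minimal NumPy sketch of the procedure as I described it. It is not my actual code: the names `relative_transform` and `homography_by_dropping`, the example values of `K`, `T0` and `T1`, the assumption that the 4x4 matrices are world-to-camera, and the `K @ R @ inv(K)` form are all illustrative assumptions.

```python
import numpy as np

def relative_transform(T0, T1):
    """Return M10, mapping camera-0 coordinates to camera-1 coordinates.

    Assumes T0 and T1 are 4x4 world-to-camera matrices (an assumption;
    a camera-to-world matrix would have to be inverted first).
    """
    return T1 @ np.linalg.inv(T0)

def homography_by_dropping(K, M10):
    """Build H10 by dropping the last row and column of M10.

    Only the 3x3 rotation block survives, so the translation is discarded.
    """
    R = M10[:3, :3]
    return K @ R @ np.linalg.inv(K)

# Hypothetical intrinsics and poses: no rotation, translation along x only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
T0 = np.eye(4)
T1 = np.eye(4)
T1[:3, 3] = [0.5, 0.0, 0.0]

M10 = relative_transform(T0, T1)
H10 = homography_by_dropping(K, M10)

print(M10[:3, 3])                    # the translation sits in the rightmost column
print(np.allclose(H10, np.eye(3)))   # True: H10 collapses to the identity
```

With a pure translation, the rightmost column of `M10` holds the camera motion, but nothing from that column reaches `H10`.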
Edit: I know that the images are related by a homography because, given the two images from P0 and P1, I can find a homography (by feature matching) that perfectly maps the image from P0 to the image from P1, even in the presence of translational camera movement.
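The observation in this edit can be reproduced without Blender. The sketch below is a stand-in for what I did, not the same thing: it replaces feature matching with synthetic, noise-free correspondences on a plane and fits the homography with a plain direct linear transform (no RANSAC, no coordinate normalization). The helpers `project` and `fit_homography`, the intrinsics, the plane at depth 5 and the translation are all hypothetical, and the projection assumes `x ~ K (R X + t)` with the camera looking down +z.

```python
import numpy as np

def project(K, R, t, X):
    """Project Nx3 world points with x ~ K (R X + t); returns Nx2 pixels."""
    Xc = X @ R.T + t
    x = Xc @ K.T
    return x[:, :2] / x[:, 2:3]

def fit_homography(src, dst):
    """Direct linear transform: find H with dst ~ H src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Hypothetical setup: points on the plane z = 5, camera translated, not rotated.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 20),
                     rng.uniform(-1, 1, 20),
                     np.full(20, 5.0)])

pts0 = project(K, np.eye(3), np.zeros(3), X)
pts1 = project(K, np.eye(3), np.array([0.3, -0.1, 0.2]), X)

H = fit_homography(pts0, pts1)

# Map the P0 pixels through H and compare with the P1 pixels.
mapped = np.column_stack([pts0, np.ones(len(pts0))]) @ H.T
mapped = mapped[:, :2] / mapped[:, 2:3]
max_err = np.abs(mapped - pts1).max()

print(max_err)                       # reprojection error in pixels
print(np.allclose(H, np.eye(3)))     # False: the fitted H is not the identity
```

So for a planar scene a non-identity homography does exist under pure translation, which is what the feature-matching result shows and what my H10 fails to reproduce.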