Homography maps points
1. On plane to points at another plane
2. Projections of points in 3D (no obligatory lying on the same plane) during a pure camera rotation or zoom.
The latter can be easily verified if you look at the rays that connect points while sensor plane rotates: green are two sensor positions and black is a 3d object
Since Homography is between projections and not between objects in 3D you don’t care what these projections represent. But this can be confusing, I agree. For example you can point your camera at 3D scene (that is not flat!), then rotate your camera and the two resulting pictures of the scene will be related by homography. This is, by the way, a foundation for image panoramas.
Three point correspondences you mentioned may be reladte to a transformation called Affine (happens during large zooms when a perspective effects disappears) or to the finding a rigid rotation and translation in 3D space. Both require 3 point correspondences but the former needs only 2D points while the latter needs 3D points. The latter case has 6DOF ( 3 for rotation and 3 for translation) while each correspondence provides 2DOF, hence 6/2=3 correspondences. Homography has 8 DOF so there should be 8/2=4 correspondences;
Below is a little diagram that explains the difference between affine and homographs transformation when the original square tilts forward. In affine case the perspective effect is negligible that is far side has the same length as a near one. In the case of Homography the far side is shorter.