The perspective transformation from a 3D object point to the 2D image plane is:
s[u v 1]^t = A[R T][X Y Z 1]^t
where A is the camera intrinsic matrix, which is known.
In Matlab, we can use the "extrinsics" function to calculate R and T given four corresponding image points [u v] and world points [X Y].
However, there are 13 variables (including s), and we only have 12 equations here. (BTW, I set Z = 0; is this right, or can Z be any value?) How can I compute s, R and T? What is the math behind it?
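
To make the Z = 0 case concrete, this is how I understand the model when all world points lie on one plane. The names r_1, r_2, r_3 are my own labels for the columns of R (they do not appear in the equation above), and I am assuming the world frame is chosen so that the plane is Z = 0:

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= A \begin{bmatrix} r_1 & r_2 & r_3 & T \end{bmatrix}
  \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}
= A \begin{bmatrix} r_1 & r_2 & T \end{bmatrix}
  \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}
```

If this is right, the column r_3 drops out and A[r_1 r_2 T] is a 3x3 matrix, but I still don't see how s, R and T are recovered from it.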