I am trying to render a simulated depth image of an object model in OpenGL, given the intrinsic parameters of a RealSense camera: fx, fy, ppx, ppy.
I construct the projection matrix myself with help from this, that and also that. Somehow the result is not correct, and I got confused by their conventions for the intrinsics: they use alpha and beta, which I believe are different from the focal lengths fx and fy.
My understanding is that the intrinsic matrix transforms camera coordinates to pixel coordinates. But the projection from the view frustum to NDC also performs a perspective mapping of this kind, and I don't understand how to combine the two.
The second answer in this uses the exact same coordinate system as mine, but it does not work in my case either: I still can't see anything in my window. It would be great if someone could show me how to derive the matrix from first principles.
This is my original, working display code:
glMatrixMode(GL_PROJECTION);
glPushMatrix();
gluPerspective(60, (float)(*width) / (*height), zNear, zFar);
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
gluLookAt(camOrigin[0], camOrigin[1], camOrigin[2],
          camOrigin[0] + camLookAt[0], camOrigin[1] + camLookAt[1], camOrigin[2] + camLookAt[2],
          camUp[0], camUp[1], camUp[2]);
And I changed it to this to make use of the intrinsics (I also tried the transposed layout to see if that helps):
static const double fx = intrin.fx, fy = intrin.fy, cx = intrin.ppx, cy = intrin.ppy, zNear = 0.01, zFar = 20.0, s = 0;
glMatrixMode(GL_PROJECTION);
glPushMatrix();
GLdouble perspMatrix[16] = {
    2 * fx / *width,        0,                      0,                                 0,
    2 * s / *width,         2 * fy / *height,       0,                                 0,
    2 * (cx / *height) - 1, 2 * (cy / *height) - 1, (zFar + zNear) / (zFar - zNear),   1,
    0,                      0,                      2 * zFar * zNear / (zNear - zFar), 0
};
// The transposed layout I also tried:
//GLdouble perspMatrix[16] = {
//    2 * fx / *width, 2 * s / *width,   2 * (cx / *height) - 1,          0,
//    0,               2 * fy / *height, 2 * (cy / *height) - 1,          0,
//    0,               0,                (zFar + zNear) / (zFar - zNear), 2 * zFar * zNear / (zNear - zFar),
//    0,               0,                1,                               0
//};
glMultMatrixd(perspMatrix);
glMatrixMode(GL_MODELVIEW);
glPushMatrix();
gluLookAt(camOrigin[0], camOrigin[1], camOrigin[2],
          camOrigin[0] + camLookAt[0], camOrigin[1] + camLookAt[1], camOrigin[2] + camLookAt[2],
          camUp[0], camUp[1], camUp[2]);
Any idea what is wrong?