
I want to extract a depth map from a calibrated image and a triangular mesh using OpenCV called from Matlab 2014b (via the OpenCV bindings). I am a regular user of Matlab but am new to OpenCV. I have the following inputs:

im - an undistorted RGB image of the scene

or - camera position vector

R - rotation matrix describing camera pose

points - nx3 triangular mesh vertex list representing the scene

faces - mx3 triangular mesh face list

EFL - image effective focal length in pixels

I have written a native Matlab ray-tracing engine to extract a depth map from these inputs, but it is quite slow and suffers from high reprojection errors (I want to compare the results from the OpenCV functions with my own, to establish whether these errors stem from my implementation or simply from camera calibration inaccuracies).

How can a depth map be obtained from these inputs using OpenCV called from Matlab?

Any help would be greatly appreciated

Thanks

Thomas

1 Answer

Proposed strategy

You could project the vertices from your mesh into 2D pixel coordinates (using your calibrated camera model). Then for each face, you can find all of the pixel centres (lattice points) contained in the 2D triangle formed by its projected vertices. You may have to keep track of which triangle is the nearest in the case of overlap. Now you know which face corresponds to each pixel. This should be very fast unless your mesh is much higher resolution than your image.
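A minimal sketch of that first step, in Python/NumPy for brevity (the function name, the barycentric inside-test, and the nearest-face bookkeeping are my own illustration, not an existing OpenCV or Matlab API):

```python
import numpy as np

def face_index_map(points, faces, K, R, t, h, w):
    """Project mesh vertices and record, per pixel, the nearest face.

    points: (n,3) world coordinates; faces: (m,3) vertex indices;
    K: 3x3 intrinsic matrix; R, t: world-to-camera pose; h, w: image size.
    """
    cam = points @ R.T + t          # world -> camera frame
    z = cam[:, 2]
    uv = cam @ K.T                  # homogeneous pixel coordinates
    uv = uv[:, :2] / uv[:, 2:3]     # perspective divide

    face_id = -np.ones((h, w), dtype=int)
    depth = np.full((h, w), np.inf)

    for fi, (a, b, c) in enumerate(faces):
        tri = uv[[a, b, c]]
        zs = z[[a, b, c]]
        # bounding box of candidate pixel centres, clipped to the image
        x0, y0 = np.floor(tri.min(axis=0)).astype(int)
        x1, y1 = np.ceil(tri.max(axis=0)).astype(int)
        x0, y0 = max(x0, 0), max(y0, 0)
        x1, y1 = min(x1, w - 1), min(y1, h - 1)
        for py in range(y0, y1 + 1):
            for px in range(x0, x1 + 1):
                # barycentric coordinates of the pixel centre
                v0, v1 = tri[1] - tri[0], tri[2] - tri[0]
                v2 = np.array([px, py]) - tri[0]
                den = v0[0] * v1[1] - v1[0] * v0[1]
                if abs(den) < 1e-12:
                    continue                    # degenerate projected triangle
                u = (v2[0] * v1[1] - v1[0] * v2[1]) / den
                v = (v0[0] * v2[1] - v2[0] * v0[1]) / den
                if u < 0 or v < 0 or u + v > 1:
                    continue                    # pixel centre outside triangle
                # screen-space interpolated depth, used only to pick the
                # nearest face in case of overlap; exact depth comes from
                # the ray-plane intersection afterwards
                d = (1 - u - v) * zs[0] + u * zs[1] + v * zs[2]
                if 0 < d < depth[py, px]:
                    depth[py, px] = d
                    face_id[py, px] = fi
    return face_id, depth
```

The double loop over pixels is per-face, so it only touches each triangle's bounding box rather than the whole image.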

You can then find the 3D ray corresponding to each pixel using the camera model, and intersect the ray with the known face for that pixel to calculate the depth (sounds like you already did this part). This shouldn't take too long either, now that you know the plane.
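The ray-plane intersection for a pixel whose face is already known might look like this (same caveats; `pixel_depth` is an illustrative name, and the triangle is assumed to be given in the camera frame):

```python
import numpy as np

def pixel_depth(px, py, tri_cam, K):
    """Depth (camera-frame Z) at pixel centre (px, py) for a known triangle.

    tri_cam: (3,3) triangle vertices in the camera frame; K: 3x3 intrinsics.
    """
    # back-project the pixel to a ray direction d, normalised so d_z = 1;
    # the intersection parameter s is then exactly the Z depth
    d = np.linalg.inv(K) @ np.array([px, py, 1.0])
    d = d / d[2]
    # plane through the triangle: n . x = n . p0
    n = np.cross(tri_cam[1] - tri_cam[0], tri_cam[2] - tri_cam[0])
    s = (n @ tri_cam[0]) / (n @ d)   # ray origin is the camera centre
    return s
```

Because the face is known, there is no search here at all: one inverse (which can be hoisted out of the loop), one cross product, and two dot products per pixel.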

More info on the camera projection

OpenCV has a good resource on using the camera model (below). Basically, you can project 3D point M' to pixel coordinate m'; this is how you project your vertices to pixel positions. Going the other direction, scale is unrecoverable -- you get the ray M'/s rather than the point M'. The depth you're looking for is s, which is the 3D point's Z coordinate in the camera frame. If your mesh is in a camera-centric frame (X right, Y down, Z out), R = Identity and t = 0. If it's not, [R|t] transforms it to be.

Compact:

    s·m′ = A·[R|t]·M′

where m′ = (u, v, 1)ᵀ is the homogeneous pixel coordinate, M′ = (X, Y, Z, 1)ᵀ the homogeneous 3D point, A the intrinsic camera matrix, and s the projective scale.

Expanding each factor lets us see the makeup of the matrices.

Expanded:

        [u]   [fx  0 cx] [r11 r12 r13 t1] [X]
    s · [v] = [ 0 fy cy] [r21 r22 r23 t2] [Y]
        [1]   [ 0  0  1] [r31 r32 r33 t3] [Z]
                                          [1]
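A quick numeric round trip makes the roles of s, m′, and M′ concrete (the intrinsics and the point are made-up numbers, not from the question):

```python
import numpy as np

# assumed intrinsics: fx = fy = 1000 px (the EFL), principal point (320, 240)
A = np.array([[1000.0, 0.0, 320.0],
              [0.0, 1000.0, 240.0],
              [0.0, 0.0, 1.0]])

# a 3D point already in the camera frame (so R = I, t = 0)
M = np.array([0.2, -0.1, 4.0])

# forward projection: s * m' = A * M
m_h = A @ M                 # homogeneous pixel coordinates
s = m_h[2]                  # the scale factor is the depth Z
m = m_h[:2] / s             # pixel coordinates (u, v)

# back-projection recovers only the ray M/s, not M itself;
# multiplying by s works here only because s (the depth) is known
ray = np.linalg.inv(A) @ np.array([m[0], m[1], 1.0])
recovered = ray * s
```

Here `s` comes out as 4.0 (the point's Z), and `recovered` equals `M`, which is exactly the scale ambiguity described above: without `s`, the back-projection only pins down the ray.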

The code that you suggested below uses OpenCV's projectPoints function, which implements the above equation plus the lens distortion model (see the main OpenCV reference). You populate the matrices and it performs the multiplication. An alternative example of projectPoints usage is available on GitHub, and I believe the same example is discussed in this SO question.

Code suggested by asker

Apparently the following code does the job. I may need some time to pick through it given that my C++ knowledge is practically zero (I realise that it is commented out BTW):

       //CString str;
       //cv::Mat CamMatrix(3, 3, CV_64F);
       //cv::Mat distCoeffs(5, 1, CV_64F);
       //m_CamCalib.GetOpenCVInfo(&CamMatrix, &distCoeffs);
       //vector<Point3d> GCP_Points;
       //vector<Point2d> Image_Points;
       //cv::Mat RVecs(3, 3, CV_64F); // rotation matrix
       //cv::Mat TranRVecs(3, 3, CV_64F); // rotation matrix
       //cv::Mat TVecs(3, 1, CV_64F); // translation vector
       //RVecs.at<double>(0, 0) = m_CamPosMtrx.m_pMtrx[0];
       //RVecs.at<double>(1, 0) = m_CamPosMtrx.m_pMtrx[1];
       //RVecs.at<double>(2, 0) = m_CamPosMtrx.m_pMtrx[2];

       //RVecs.at<double>(0, 1) = m_CamPosMtrx.m_pMtrx[4];
       //RVecs.at<double>(1, 1) = m_CamPosMtrx.m_pMtrx[5];
       //RVecs.at<double>(2, 1) = m_CamPosMtrx.m_pMtrx[6];

       //RVecs.at<double>(0, 2) = m_CamPosMtrx.m_pMtrx[8];
       //RVecs.at<double>(1, 2) = m_CamPosMtrx.m_pMtrx[9];
       //RVecs.at<double>(2, 2) = m_CamPosMtrx.m_pMtrx[10];
       //transpose(RVecs, TranRVecs);
       //TVecs.at<double>(0, 0) = 0;
       //TVecs.at<double>(1, 0) = 0;
       //TVecs.at<double>(2, 0) = 0;
       //GCP_Points.push_back(Point3d((x - m_CamPosMtrx.m_pMtrx[12]), (y - m_CamPosMtrx.m_pMtrx[13]), (z - m_CamPosMtrx.m_pMtrx[14])));
       //Image_Points.push_back(Point2d(0, 0));
       //projectPoints(GCP_Points, TranRVecs, TVecs, CamMatrix, distCoeffs, Image_Points);

//bool CCameraCalibration::GetOpenCVInfo(Mat * cameraMatrix, Mat * distCoeffs)
//{
//            int i,j;
//            Mat projMatrix;
//            CMatrix4x4 m1;
//            if(cameraMatrix->rows==0) cameraMatrix->create(3,3, CV_64F);
//            if(distCoeffs->rows==0) distCoeffs->create(5, 1, CV_64F);
//            for(i=0;i<3;i++)
//            for(j=0;j<3;j++){
//                   cameraMatrix->at<double>(i,j)=m_pCameraMatrix[i][j];
//            }
//            for(i=0;i<5;i++)
//                   distCoeffs->at<double>(i,0)=m_pCoefficients[i];
//     return false;
//}
kmac
  • I like your approach. At present, the Matlab code uses octree subdivision to partition the mesh (re-meshing the octree-split TIN model is the first bottleneck: the ismember search function in Matlab scales very poorly). I then locate box-ray intersections using Smits' algorithm, then search the contents of each box and solve ray-triangle intersections using the Möller–Trumbore algorithm. Though your method seems better, speed is not my main issue: I am struggling with the setup of the ray-tree as this seems to be where the re-projection errors are coming from. How to set this up using OpenCV? Thanks. Thomas – Thomas Seers Nov 17 '15 at 14:55
  • Hmm, what happens when a face passes through several octree subdivisions? Do you add it to each leaf, just one, or split the face into several triangles for each subdivision? Doing this search in 3D is definitely adding complexity here. Even though it means starting a new method, the 2D search might still be easier to get working. – kmac Nov 17 '15 at 21:47
  • Cheers kmac. Spatial partitioning and ray box search is pretty standard in the ray tracing literature (i.e. BSP trees). As I think you have spotted, a shared triangle is duplicated in spanned bins (this search really kills the speed of the operation). I have acquired some C++ code from a colleague that is supposed to perform the pixel projection using OpenCV. As a C++/OpenCV noob, I am still picking through it tbh. Perhaps it will be of assistance to someone else out there.... I will post it below: – Thomas Seers Nov 17 '15 at 23:24
  • A good resource on using the pinhole camera model (for the 3D to 2D projection and the 2D to 3D ray) in OpenCV [can be found here](http://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html). I'll expand on it a bit in the answer. – kmac Nov 18 '15 at 03:00
  • Back to your original point of setting up the ray-tree in OpenCV -- I haven't done any ray tracing in OpenCV; maybe someone else will be able to suggest the complementary full-3D approach. – kmac Nov 18 '15 at 03:56
  • Thanks kmac. I think that OpenCV on its own is not appropriate for the ray tracing part. I have found the following functions which use the OpenSceneGraph and OpenCV to extract the depth map info called from Matlab: http://www.openu.ac.il/home/hassner/projects/poses/. I will still try your suggestions ;). I tried to vote you up but am too much of a Noob for it to register! Cheers. Thomas. – Thomas Seers Nov 18 '15 at 09:54
  • Haha, thanks. You can click the checkmark to mark the question as answered. – kmac Nov 18 '15 at 20:52