
I need to calculate a point's X,Y coordinates in the world, with respect to the camera, from its u,v coordinates in the 2D image. I am using an S7 edge camera to send a 720x480 video feed to MATLAB.

What I know: Z, i.e. the depth of the object from the camera; the size of the camera pixels (1.4 um); and the focal length (4.2 mm).

Let's say the image point is at (u,v) = (400,400).

My approach is as follows:

  1. Subtract the pixel coordinates of the image center, (240, 360), from the u,v pixel coordinates of the point in the image. This gives the pixel coordinates with respect to the camera's optical axis (the z axis); the origin is now at the center of the image. Flipping the sign of the second coordinate (image v increases downward while Y increases upward), the new coordinates are (160, -40).
  2. Multiply the new u,v pixel values by the pixel size to obtain the point's distance from the origin in physical units; call it (x,y). We get (x,y) = (0.224, -0.056) in mm.
  3. Use the formulas X = xZ/f and Y = yZ/f to calculate the X,Y coordinates in the real world with respect to the camera's optical axis (see the MATLAB sketch after this list).
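
In MATLAB, the three steps above look roughly like this (a minimal sketch: the depth Z is a placeholder value, and the image center (240, 360) is the one assumed in step 1):

    % Pixel coordinates of the point and the assumed image center (step 1)
    u = 400;  v = 400;
    u0 = 240; v0 = 360;          % assumed image center in pixels

    % Known camera parameters
    pixelSize = 1.4e-3;          % pixel pitch in mm (1.4 um)
    f = 4.2;                     % focal length in mm
    Z = 1000;                    % depth in mm (placeholder value)

    % Step 1: shift the origin to the optical axis (v flipped so Y points up)
    du =  (u - u0);              % 160
    dv = -(v - v0);              % -40

    % Step 2: convert from pixels to physical sensor units (mm)
    x = du * pixelSize;          % 0.224 mm
    y = dv * pixelSize;          % -0.056 mm

    % Step 3: back-project using similar triangles
    X = x * Z / f;
    Y = y * Z / f;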

Is my approach correct?

Wahaj Ahmad

1 Answer


Your approach is going in the right direction, but it would be easier if you used a more standardized approach. What we usually do is use the pinhole camera model, which gives you a transformation between the world coordinates [X, Y, Z] and the pixel coordinates [x, y]. Take a look at this guide, which describes step by step the process of building your transformation.

Basically, you have to define your internal (intrinsic) camera matrix to do the transformation:

    K = | fx   0   u0 |
        |  0   fy  v0 |
        |  0   0    1 |

  • fx and fy are your focal lengths scaled to pixel units. You can calculate them from your FOV and the total number of pixels in each direction. Take a look here and here for more info.
  • u0 and v0 are the principal point. Since our pixel coordinates are not centered at [0, 0], these parameters represent a translation to the center of the image (the intersection of the optical axis with the image plane, given in pixel coordinates).

  • If you need to, you can also add a skew factor a, which you can use to correct shear effects of your camera. The internal camera matrix then becomes:

    K = | fx   a   u0 |
        |  0   fy  v0 |
        |  0   0    1 |

Since your depth is fixed, just plug in your Z and carry out the transformation without a problem.
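
As a rough MATLAB sketch for the camera in the question (the focal length in pixels is derived here from the 4.2 mm focal length and 1.4 um pixel pitch stated above, the principal point is taken as the image center, and identity extrinsics, i.e. no rotation or translation between camera and world, are assumed):

    % Intrinsic (internal) camera matrix, no skew
    fx = 4.2 / 1.4e-3;           % focal length in pixels (= 3000)
    fy = fx;
    u0 = 240; v0 = 360;          % principal point, taken as the image center

    K = [fx  0  u0;
          0 fy  v0;
          0  0   1];

    % Forward projection of a point [X; Y; Z] given in the camera frame
    P  = [0.05; -0.02; 1.0];     % hypothetical point, in meters
    p  = K * P;                  % homogeneous pixel coordinates
    uv = p(1:2) / p(3);          % divide by the depth Z to get [u; v]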

Remember: if you want the inverse transformation (from pixels back to camera/world coordinates), just invert your camera matrix and be happy!
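
A minimal MATLAB sketch of that inverse, assuming the K built above, a known depth Z, and again identity extrinsics:

    % Back-project a pixel to 3D coordinates using the known depth Z
    uv  = [400; 400];            % pixel of interest
    Z   = 1.0;                   % known depth (assumed value, in meters)

    ray = K \ [uv; 1];           % same as inv(K) * [u; v; 1]
    XYZ = Z * ray;               % scale by the depth to get [X; Y; Z]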

MATLAB also has a very good guide for this transformation. Take a look.

Leonardo Mariga
  • Hey @Leonardo Mariga, I followed the MATLAB tutorial for camera calibration. It gives me the intrinsic camera matrix. I am unsure how to perform the inversion of the camera matrix that you have suggested. The complete camera model involves the camera's extrinsic matrix as well. The extrinsic matrix computed through MATLAB's calibration is relative to the checkerboard I used, right? How do I know the extrinsic matrix for a different application? I am trying to detect a doorknob and, using the centroid of the knob in pixels, calculate its position in the real world given the depth of the knob from the camera. – Wahaj Ahmad Jun 11 '20 at 22:18
  • Using the focal length in pixels provided in the intrinsic matrix, I'm using the following two equations to find the X,Y values in the real world: x = (u - cx) * z / fx and y = (v - cy) * z / fy, where u,v are the pixel coordinates of the knob's centroid, cx,cy are the image center, z is the depth from the camera lens to the doorknob in mm, and fx,fy are the focal lengths in pixels given by the intrinsic matrix. The results are off by around 100 mm in the y direction and 50 mm in the x direction. – Wahaj Ahmad Jun 11 '20 at 22:22
  • About the inverse: you can exclude the last column and take the inverse of the resulting 3x3 matrix. Take a look [here](https://math.stackexchange.com/questions/2237994/back-projecting-pixel-to-3d-rays-in-world-coordinates-using-pseudoinverse-method). Yes, it's relative to the checkerboard. For other applications you have to calibrate the new camera, and there are other ways to obtain the rotation and translation of the camera (ArUco, for example). – Leonardo Mariga Jun 11 '20 at 22:41
  • I realize that this is an "old" post, but I am trying to do something similar, and unfortunately I do not entirely understand how to get to the result with this info. I have all my camera parameters and created my matrix, but how do I translate my 2D coordinates to 3D with it? – Martin Pedersen May 10 '22 at 12:28