Find world space coordinate for pixel in OpenCV

Question

I need to find the world coordinate of a pixel using OpenCV. So when I take pixel (0,0) in my image (that's the upper-left corner), I want to know which 3D world space coordinate this pixel corresponds to on my image plane. I know that a single pixel corresponds to a line of 3D points in world space, but I specifically want the one that lies on the image plane itself.

This is the formula of the OpenCV Pinhole model of which I have the first (intrinsics) and second (extrinsics) matrices. I know that I have u and v, but I don't know how to get from this u and v to the correct X, Y and Z coordinate.

[Image: OpenCV pinhole model, s [u v 1]^T = K [R | t] [X Y Z 1]^T]

What I've tried already:

I thought to just set s to 1 and make a homogeneous coordinate from [u v 1]^T by adding a 1, like so: [u v 1 1]^T. Then I multiplied the intrinsics with the extrinsics and made it into a 4x4 matrix by adding the following row: [0 0 0 1]. This was then inverted and multiplied with [u v 1 1]^T to get my X, Y and Z. But when I checked if four pixels calculated like that lay on the same plane (the image plane), this was wrong.

So, any ideas?

If you compute the inverse of the intrinsics, you can multiply both sides by it on the left. Then you have to invert the extrinsics (is there a `left-inverse`?). At the end you should have a formula giving the pixel-to-world coordinates depending on your choice of `s`. If you write an equation that describes your 3D world plane, you can compute the single `s` that hits the plane. — Micka, Jan 19 '15 at 10:24
Adding another [1] as you've done might work too, just keep the `s` ... I guess it should be `[s*u,s*v,s,1]` but I'm not sure ;) — Micka, Jan 19 '15 at 10:41
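
The comments above suggest keeping `s` free and solving for the value that hits a known world plane. A minimal sketch of that idea, assuming NumPy, the model s [u v 1]^T = K (R X + t), and a plane written as n . X = d0; the function names and all numeric values are hypothetical and not taken from the question:

```python
import numpy as np

def pixel_ray(u, v, K, R, t):
    """Return (C, d) such that the world points seen at pixel (u, v) are C + s * d."""
    C = -R.T @ t                                          # camera center in world coordinates
    d = R.T @ np.linalg.solve(K, np.array([u, v, 1.0]))   # ray direction in world coordinates
    return C, d

def intersect_plane(C, d, n, d0):
    """Solve n . (C + s * d) = d0 for s and return (s, intersection point)."""
    denom = n @ d
    if abs(denom) < 1e-12:
        raise ValueError("ray is parallel to the plane")
    s = (d0 - n @ C) / denom
    return s, C + s * d

# Hypothetical calibration: 800 px focal length, principal point (320, 240),
# camera 5 units in front of the world plane Z = 0, no rotation.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

C, d = pixel_ray(400, 240, K, R, t)
s, X = intersect_plane(C, d, n=np.array([0.0, 0.0, 1.0]), d0=0.0)
print("s =", s, "world point =", X)
```

With this parameterization `s` is the same scale factor as in the pinhole equation, so projecting the returned point with K (R X + t) and dividing by the third component should give back the original pixel.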

score 5 · Accepted Answer · answered Jan 20 '15 at 01:14

IIUC you want the intersection I with the image plane of the ray that back-projects a given pixel P from the camera center.

Let's define the coordinate systems first. The usual OpenCV convention is as follows:

Image coordinates: origin at the top-left corner, u axis going right (increasing column) and v axis going down.
Camera coordinates: origin at the camera center C, z axis going toward the scene, x axis going right and y axis going downward.

Then the image plane in the camera frame is z = fx, where fx is the focal length measured in pixels, and a pixel (u, v) has camera coordinates (u - cx, v - cy, fx), with (cx, cy) the principal point. These coordinates are still in pixel units.

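Multiplying the homogeneous pixel [u v 1]^T by the inverse of the (intrinsic) camera matrix K gives the same point in metric camera coordinates, up to scale: ((u - cx)/fx, (v - cy)/fy, 1), which lies on the plane z = 1. Scale it by the focal length in metric units if you want the point on the physical image plane.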

Finally, apply the inverse of the world-to-camera coordinate transform [R | t], that is X_world = R^T (X_cam - t), and you'll get the same point in world coordinates.
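
The steps in this answer can be sketched as follows. This is only an illustration under stated assumptions: NumPy, the convention X_cam = R X_world + t, and a hypothetical `depth` parameter that selects the camera-frame plane z = depth (1 for the normalized image plane, the metric focal length for the physical one). The calibration values are made up:

```python
import numpy as np

def pixel_to_world(u, v, K, R, t, depth=1.0):
    """World coordinates of pixel (u, v) on the camera-frame plane z = depth."""
    # K^-1 [u, v, 1]^T is the point on the normalized image plane z = 1.
    p_cam = depth * np.linalg.solve(K, np.array([u, v, 1.0]))
    # Invert X_cam = R @ X_world + t.
    return R.T @ (p_cam - t)

# Hypothetical calibration: camera rotated 30 degrees about its y axis.
a = np.deg2rad(30.0)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.1, -0.2, 2.0])

# The four corners of a 640x480 image, mapped to world coordinates.
corners = [(0, 0), (639, 0), (0, 479), (639, 479)]
pts = np.array([pixel_to_world(u, v, K, R, t) for u, v in corners])

# Sanity check: mapped back into the camera frame, every corner has z = depth,
# so the four world points lie on one plane (the image plane).
z_cam = (pts @ R.T + t)[:, 2]
print(pts)
print(np.allclose(z_cam, 1.0))
```

The coplanarity check at the end is the same test the question describes; it holds here because every point is constructed on the plane z = depth before the rigid transform is undone.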

How can you be sure that [R|T] is invertible in this case? — Yingqiang Gao, Nov 20 '17 at 11:21
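
On the invertibility question: the 3x4 matrix [R | t] is not square, but once the row [0 0 0 1] is appended it becomes a rigid transform whose determinant equals det(R) = 1, so an inverse always exists and has the closed form [R^T | -R^T t]. A small sketch, assuming NumPy and a proper rotation matrix R; the helper name and the numbers are hypothetical:

```python
import numpy as np

def invert_rigid(R, t):
    """Inverse of the 4x4 transform [[R, t], [0, 1]], using R^-1 = R^T."""
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

# Hypothetical extrinsics: 30 degree rotation about z plus a translation.
a = np.deg2rad(30.0)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a), np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

# Build the 4x4 world-to-camera transform by appending [0 0 0 1].
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = t

# The closed form should agree with a general-purpose matrix inverse.
print(np.allclose(invert_rigid(R, t), np.linalg.inv(T)))
```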
