Space represented by a single Kinect pixel at a given depth

Question

Basically, I want to take a fixed straight line across the device's field of view and determine whether anything intercepts it. In my case, I want the "laser line" to be configurable by its distance from the top of the field of view.

Now it's easy enough to get the depth data at a given pixel point simply by doing this.

var depthInMM = this._DepthPixels[i].Depth; // depth in millimetres at pixel index i

It's also easy to focus on, say, the 100th line of pixels from the top by doing something like this.

for (int i = 0; i < this._DepthPixels.Length; ++i) // _DepthPixels.Length is 307200 for 640x480
{
    if (i >= 64000 && i < 64640) // Row 100 (zero-based): 640 pixels starting at index 100 * 640
    {
        // Draw line or whatever
    }
}

Which ends up with something like this:

[image: screenshot of the result]

BUT, for example, I might want the line to sit 50 cm from the top of the field of view at 3 meters depth. I understand that as the depth increases so does the area each pixel represents, but I cannot find any reference for, or work out myself, how to calculate this relationship.

So, how can one calculate the coordinate space represented at a given depth using the Kinect sensor? Any help sincerely appreciated.

EDIT:

So if I understand correctly, this can be implemented in C# as follows:

double d = 2; //2 meters depth
double y = 100; //100 pixels from top
double vres = 480; //480 pixels vertical resolution
double vfov = 43; //43 degrees vertical field of view of Kinect
double x = (2 * Math.Sin(Math.PI * vfov / 360) * d * y) / vres;
//x = 0.30541768893691434
//x = 100 pixels down is 30.5 cm from top field of view at 2 meters depth

Your example looks correct, got the same result. – Daniel Brückner Apr 28 '14 at 19:33

Daniel Brückner · Accepted Answer · 2014-04-28T18:20:26.853

     2 sin(PI VFOV / 360) D Y
X = --------------------------
              VRES

X: distance of your line from the top of the image in meters

D: distance - orthogonal to the image plane - of your line from the camera in meters

Y: distance of your line from the top of the image in pixels

VRES: vertical resolution of the image in pixels

VFOV: vertical field of view of the camera in degrees
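The question's actual goal runs the other way: given an offset in metres and a depth, find the pixel row. A minimal Python sketch of the formula above and its inverse might look like this. The function names, the 640-pixel row width and the rounding to the nearest row are illustrative assumptions, not part of the Kinect SDK.

```python
import math


def offset_from_top_m(y_px, depth_m, vfov_deg=43.0, vres=480):
    """X from the answer's formula: metres from the top edge for pixel row y_px."""
    return 2 * math.sin(math.pi * vfov_deg / 360) * depth_m * y_px / vres


def row_for_offset(x_m, depth_m, vfov_deg=43.0, vres=480):
    """Inverse of offset_from_top_m: the (fractional) pixel row Y for offset x_m."""
    return x_m * vres / (2 * math.sin(math.pi * vfov_deg / 360) * depth_m)


def row_index_range(row, width=640):
    """Half-open [start, end) range of indices in the flat depth array for one row."""
    start = row * width
    return start, start + width


if __name__ == "__main__":
    # The question's example: a line 50 cm below the top edge at 3 m depth.
    row = round(row_for_offset(0.5, 3.0))
    print(row, row_index_range(row))
```

Because the row depends on depth, a caller would probably want to clamp the result to the valid range `0 <= row < vres` before indexing the depth array.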

edited Apr 28 '14 at 18:20

answered Apr 28 '14 at 18:15


Absolute legend! Thank you so much. Any chance you can verify my understanding above? Can I replace vertical resolution and vertical field of view with horizontal resolution and horizontal field of view to calculate the horizontal distance? Lastly, any chance you could forward me a link to exactly what maths this is based on (or feel free to elaborate)? Again, I sincerely appreciate your help. – Maxim Gershkovich Apr 28 '14 at 19:09
It will work the same horizontally. And it's just basic [trigonometry](http://en.wikipedia.org/wiki/Trigonometry). – Daniel Brückner Apr 28 '14 at 19:15
lol, I gathered that much but was hoping for the specific formula (or subsection of trig) it's based on. As you may gather, my maths is far from great and I'm trying to improve. Either way, thanks again... – Maxim Gershkovich Apr 28 '14 at 19:16
It's essentially just `y = r sin(theta)` in the triangle formed by the two lines from the camera through the center pixel of the image and the topmost center pixel, and the plane at distance `D`. The trick is to look at only half of the view frustum (that is why it is `PI VFOV / 360` instead of `2 PI VFOV / 360`) to get a right angle, and of course to do it in 2D, i.e. only in the intersection of a plane and the view frustum. [This image](http://commons.wikimedia.org/wiki/File:Oblique_perspective_view_frustum.png) has at least the center line, can't find a better image. – Daniel Brückner Apr 28 '14 at 19:28
Much appreciated. Writing 3D based software around the Kinect has been quite a challenge without this knowledge. Wish I could just download a brain like yours... :-p – Maxim Gershkovich Apr 28 '14 at 19:37
VRES = 720   # vertical resolution of the image in pixels
HRES = 1280  # horizontal resolution of the image in pixels
VFOV = 120   # vertical field of view of the camera in degrees
HFOV = 120   # horizontal field of view of the camera in degrees

def __init__(self, x, y, depth):
    self.z = depth
    # self.x = (x - 254.878) * self.z / 365.456
    # self.y = (y - 205.395) * self.z / 365.456
    self.x = int((2 * sin(pi * self.VFOV / 360) * depth * y) / self.VRES)
    self.y = int((2 * sin(pi * self.HFOV / 360) * depth * x) / self.HRES)

– Lorenzo Sciuto Aug 25 '20 at 12:54
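A note on the trigonometry discussed in the comments: `y = r sin(theta)` uses the length `r` of the ray to the top edge of the image. Under an ideal pinhole model, if `D` is instead measured orthogonally to the image plane, the half-height visible at depth `D` is `D tan(VFOV / 2)`. The sketch below compares the two forms. The 57 degree horizontal field of view and the helper names are assumptions for illustration, and neither form models lens distortion.

```python
import math


def extent_sin(pixels, depth_m, fov_deg, res):
    """The thread's formula: exact when depth_m is the length of the ray to the image edge."""
    return 2 * math.sin(math.radians(fov_deg) / 2) * depth_m * pixels / res


def extent_tan(pixels, depth_m, fov_deg, res):
    """Pinhole-model variant: depth_m measured orthogonally to the image plane."""
    return 2 * math.tan(math.radians(fov_deg) / 2) * depth_m * pixels / res


if __name__ == "__main__":
    # Vertical (43 degrees, 480 rows) and horizontal (assumed 57 degrees, 640 columns).
    for name, fov, res in (("vertical", 43.0, 480), ("horizontal", 57.0, 640)):
        s = extent_sin(100, 2.0, fov, res)
        t = extent_tan(100, 2.0, fov, res)
        print(f"{name}: sin form {s:.4f} m, tan form {t:.4f} m, ratio {t / s:.4f}")
```

The two forms differ by a factor of `1 / cos(FOV / 2)`, so the gap widens as the field of view grows.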
