
I would like to take advantage of the depth sensor in the Kinect 2.0 SDK, but not in the sense that this data is drawn or displayed in the format of an image, but rather as an integer or something similar. For example, if I hold my hand very close to the Kinect, I would get an integer value telling me the approximate range between the camera and the obstacle, maybe something like the output below. As the obstacle moves, the Kinect recalculates the distance and updates, maybe every second or half a second.

The distance between the Kinect and the obstacle is 20 cm
The distance between the Kinect and the obstacle is 10 cm
The distance between the Kinect and the obstacle is 100 cm

Is that possible? I searched for tutorials, but all I could find is that the representation usually uses a point cloud or a black-and-white depth image.

  • The "black and white depth image" presumably gives you the distance between the Kinect and some other object encoded as the lightness of a pixel. Can you explain why this information is inadequate for your purpose? – Angew is no longer proud of SO May 03 '16 at 10:12
  • I didn't know how to get from that image data to just a distance between the Kinect and only ONE obstacle. Any tutorial or sample code would be helpful. Thanks – user1680944 May 03 '16 at 10:15
  • 1
    Recommending tutorials is off-topic on (main) SO. You should try finding something yourself, following it, researching any problems you may encounter, and then come back with specific questions if they are still not resolved. You might be able to ask for some tutorial pointers etc. in a relevant chat, not sure. You might want to look into image processing or computer vision for converting a point cloud/depth image into a meaningful sceen representation. – Angew is no longer proud of SO May 03 '16 at 10:19
  • Potential avenue of research: If you *know* there's only one object in the scene (i.e. the depth image is just object or background), you should be able to use a *segmentation* algorithm to extract the object's area from the depth image (even basic thresholding might work). Then, convert the colour values from that area (perhaps the minimum, perhaps a mean) into distance. – Angew is no longer proud of SO May 03 '16 at 10:21
  • Which version of the Microsoft Kinect SDK are you using? (i.e. which version of Kinect device are you using? [Kinect for Xbox One](http://compass.xbox.com/assets/3d/37/3d377852-0f21-4074-a3c2-35f418170848.jpg?n=chandler_xboxone_hardware_960x540_01.jpg) or [Kinect for Xbox 360](http://compass.xbox.com/assets/89/91/8991d7b5-c14f-4b30-9b89-deb3ba52069c.jpg?n=Xbox360_Sensor_960x450.jpg)?) – Vito Gentile May 03 '16 at 13:07
  • @VitoGentile I am using the Kinect 2.0 SDK for the Xbox one – user1680944 May 04 '16 at 05:05

3 Answers


The Kinect does indeed use a point cloud, and each pixel in the point cloud has an unsigned short integer value that represents the distance in mm from the Kinect. From what I can tell, the camera already compensates for objects at the sides of its view being farther away than what is directly in front of it, so all the data represents the depth of those objects from the plane the Kinect's camera is viewing from. The Kinect will see up to about 8 m, but only has reliable data out to about 4-5 m, and can't see anything closer than 0.5 m.

Since it sounds like you're using it as an obstacle detection system, I'd suggest monitoring all the data points in the point cloud, averaging out grouped data points that stand apart from the others, and interpreting each group as its own object with an approximate distance. The Kinect also updates at 30 frames per second (assuming your hardware can handle the intensive data stream), so you'll simply be monitoring your point cloud constantly for the objects' changing distance.
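To illustrate the idea, here is a minimal sketch (my own, not an official SDK sample) of how a single "nearest obstacle" value could be pulled from one depth frame, assuming you already have an initialized IDepthFrameReader* from the SDK's usual sensor setup:

#include <Kinect.h>
#include <vector>

// Returns the smallest valid depth reading in millimeters, or 0 if the
// frame was unavailable or contained no valid pixels.
unsigned short NearestObstacleMm(IDepthFrameReader* reader)
{
    IDepthFrame* frame = nullptr;
    unsigned short nearest = 0;

    if (SUCCEEDED(reader->AcquireLatestFrame(&frame)))
    {
        // The Kinect 2.0 depth frame is 512x424 unsigned shorts (mm).
        std::vector<unsigned short> buffer(512 * 424);
        if (SUCCEEDED(frame->CopyFrameDataToArray(
                (UINT)buffer.size(), buffer.data())))
        {
            for (unsigned short depth : buffer)
            {
                // 0 means "no reading"; ignore anything below the ~500 mm
                // minimum range, then keep the closest remaining pixel.
                if (depth >= 500 && (nearest == 0 || depth < nearest))
                    nearest = depth;
            }
        }
        frame->Release();
    }
    return nearest;
}

Polling this once per frame (or on a half-second timer) and printing nearest / 10 would give you exactly the kind of "distance in cm" output described in the question.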

If you start by downloading both the SDK and Kinect Studio, you can use the depth/IR programming examples and the studio to get a better understanding of how the data can be used.

Sean

Even though this question was asked some time ago and the asker has most likely solved it on their own by now, I just wanted to give everyone else who might have the same problem/question the C++ code to solve it.

Note: the solution is based on the Kinect 2.0 SDK, and if I remember correctly, I took it from one of the examples provided in the SDK Browser 2.0, which comes with the Kinect 2.0 SDK. I removed all the special modifications I'd made and left only the most important aspects, so you will most likely have to modify the function and give it a return value of some kind (see the sketch after the code).

m_pDepthFrameReader is an initialized IDepthFrameReader*:

void KinectFusionProcessor::getDistance() {
    IDepthFrame* frame = NULL;
    if (SUCCEEDED(m_pDepthFrameReader->AcquireLatestFrame(&frame))) {
        // The Kinect 2.0 depth frame is 512x424 pixels.
        int width = 512;
        int height = 424;
        unsigned int sz = (512 * 424);
        unsigned short buf[512 * 424] = {0}; // one unsigned short per pixel, in mm

        // Copy the latest depth frame into the local buffer.
        frame->CopyFrameDataToArray(sz, buf);

        const unsigned short* curr = (const unsigned short*)buf;
        const unsigned short* dataEnd = curr + (width * height);

        while (curr < dataEnd) {
            // Get depth in millimeters (0 means no valid reading for that pixel).
            unsigned short depth = (*curr++);
        }
    }
    if (frame) frame->Release();
}
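Since the note above says you'll likely have to give the function a return value, here is one hypothetical way to do that (my own modification, not part of the SDK sample): average the valid readings in a window around the image center and return that as a single distance in millimeters:

unsigned short KinectFusionProcessor::getDistance() {
    IDepthFrame* frame = NULL;
    unsigned long sum = 0;
    unsigned int count = 0;

    if (SUCCEEDED(m_pDepthFrameReader->AcquireLatestFrame(&frame))) {
        const int width = 512;
        const int height = 424;
        // static keeps the ~425 KB frame buffer off the stack.
        static unsigned short buf[512 * 424];
        frame->CopyFrameDataToArray(width * height, buf);

        // Average the valid pixels in a 100x100 window around the center,
        // treating them as one obstacle directly in front of the sensor.
        for (int y = height / 2 - 50; y < height / 2 + 50; ++y) {
            for (int x = width / 2 - 50; x < width / 2 + 50; ++x) {
                unsigned short depth = buf[y * width + x];
                if (depth != 0) { // 0 = no valid reading for this pixel
                    sum += depth;
                    ++count;
                }
            }
        }
    }
    if (frame) frame->Release();
    return count ? (unsigned short)(sum / count) : 0;
}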
ddonat

This is pretty straightforward. I don't know the exact code in C++, but in C#, once you have the depth frame, you need to do the following. I'll assume that you already know the X and Y point where you want to evaluate the depth value.

Once you know that, you first need to convert the raw bytes of the depth frame into ushort values (each pixel's depth is stored as two bytes, in millimeters).

After that, you need to calculate the index inside the depth array that corresponds to your X and Y point. Since the frame is stored row by row, the index is y times the frame width plus x:

// Get the depth for this pixel
ushort depth = frameData[y * depthFrameDescription.Width + x];

I hope this can be helpful.

16per9