getUserPixels - alternative in official Kinect SDK

Question

Is there an alternative for the getUserPixels method offered by OpenNI in the official Kinect SDK?

How would one implement this functionality with the official Kinect SDK?

If I understand the use of this function, it appears to extract user silhouettes. While there is no direct function call in the official SDK, see http://stackoverflow.com/questions/13796301/is-it-possible-to-extract-the-players-depth-pixels-only-out-of-the-depth-bitma for information on extracting player silhouettes. Please correct me if I'm not correctly grasping the function and goal. — Nicholas Pappas, Dec 12 '12 at 19:39
Hi Evil Closet Monkey :) I am looking for background removal. Using only the depth data from the Kinect SDK, you would come across the problem of other objects being at the same depth as the primary user. The OpenNI library tells you which depth pixels belong to the user and which do not. Correct me if I am wrong regarding the "objects being at the same depth" problem. — oneiros, Dec 12 '12 at 21:04
Using purely a depth value you'd be correct; however, the depth data also includes a player mask, which marks only the pixels that correspond to a given user. This can be used to create a silhouette, or mapped to the color data to create a "green screen" effect (aka: background removal). I'll include some links in an answer, for better formatting... — Nicholas Pappas, Dec 12 '12 at 21:14

score 1 · Accepted Answer · answered Dec 12 '12 at 21:44

The official Kinect for Windows SDK (v1.6) does not support a direct call, such as getUserPixels, to extract a player silhouette but does contain all the information necessary to do so.

You can see this in action, in different ways, by examining two of the examples available from the Kinect for Windows Developer Toolkit.

Basic Interactions-WPF: includes a function to create a simple silhouette of the user being tracked.
Green Screen (-WPF, or -D2D): shows how to perform background subtraction to produce a green screen effect. In this example the data from the RGB camera is superimposed over an image.

The two examples do this in different ways.

Basic Interactions will pull out a BitmapMask from the depth data which corresponds to the requested player. This has the advantage of only showing tracked users; any object not thought to be a skeleton is ignored.
Green Screen does not look for a particular user, instead opting for motion. This gives the advantage of silhouetting any moving object -- such as a ball being passed between two users.

I believe the "Basic Interactions" example will show you how to implement what you are looking for. You'll have to do the work yourself, but it is possible. For example, using the "Basic Interactions" example as a base, I created a UserControl that generates a simple silhouette of the user being tracked...

When the skeleton frame is ready, I pull out the player index:

private void OnSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
{
    using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
    {
        if (skeletonFrame != null && skeletonFrame.SkeletonArrayLength > 0)
        {
            if (_skeletons == null || _skeletons.Length != skeletonFrame.SkeletonArrayLength)
            {
                _skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
            }

            skeletonFrame.CopySkeletonDataTo(_skeletons);

            // grab the tracked skeleton and set the playerIndex for use pulling
            // the depth data out for the silhouette.
            // NOTE: this assumes only a single tracked skeleton!
            this.playerIndex = -1;
            for (int i = 0; i < _skeletons.Length; i++)
            {
                if (_skeletons[i].TrackingState != SkeletonTrackingState.NotTracked)
                {
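                    // the player index stored in the depth data is 1-based
                    // (0 means "no player"), hence the skeleton slot plus one.
                    this.playerIndex = i + 1;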
                }
            }
        }
    }
}

Then, when the next depth frame is ready, I pull out the BitmapMask for the user that corresponds to playerIndex.

private void OnDepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
{
    using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
    {
        if (depthFrame != null)
        {
            // check if the format has changed.
            bool haveNewFormat = this.lastImageFormat != depthFrame.Format;

            if (haveNewFormat)
            {
                this.pixelData = new short[depthFrame.PixelDataLength];
                this.depthFrame32 = new byte[depthFrame.Width * depthFrame.Height * Bgra32BytesPerPixel];
                this.convertedDepthBits = new byte[this.depthFrame32.Length];
            }

            depthFrame.CopyPixelDataTo(this.pixelData);

            for (int i16 = 0, i32 = 0; i16 < pixelData.Length && i32 < depthFrame32.Length; i16++, i32 += 4)
            {
                int player = pixelData[i16] & DepthImageFrame.PlayerIndexBitmask;
                if (player == this.playerIndex)
                {
                    convertedDepthBits[i32 + RedIndex] = 0x44;
                    convertedDepthBits[i32 + GreenIndex] = 0x23;
                    convertedDepthBits[i32 + BlueIndex] = 0x59;
                    convertedDepthBits[i32 + 3] = 0x66;
                }
                else if (player > 0)
                {
                    convertedDepthBits[i32 + RedIndex] = 0xBC;
                    convertedDepthBits[i32 + GreenIndex] = 0xBE;
                    convertedDepthBits[i32 + BlueIndex] = 0xC0;
                    convertedDepthBits[i32 + 3] = 0x66;
                }
                else
                {
                    convertedDepthBits[i32 + RedIndex] = 0x0;
                    convertedDepthBits[i32 + GreenIndex] = 0x0;
                    convertedDepthBits[i32 + BlueIndex] = 0x0;
                    convertedDepthBits[i32 + 3] = 0x0;
                }
            }

            if (silhouette == null || haveNewFormat)
            {
                silhouette = new WriteableBitmap(
                    depthFrame.Width,
                    depthFrame.Height,
                    96,
                    96,
                    PixelFormats.Bgra32,
                    null);

                SilhouetteImage.Source = silhouette;
            }

            silhouette.WritePixels(
                new Int32Rect(0, 0, depthFrame.Width, depthFrame.Height),
                convertedDepthBits,
                depthFrame.Width * Bgra32BytesPerPixel,
                0);

            Silhouette = silhouette;

            this.lastImageFormat = depthFrame.Format;
        }
    }
}

What I end up with is a purple silhouette of the user in a WriteableBitmap, which can be copied to an Image on the control or pulled and used elsewhere. Once you have the BitmapMask you could also map the data to the color stream if you wanted to actually see the RGB data that corresponds to that area.
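To make the color-stream idea a little more concrete, here is a minimal sketch of the masking step only. It assumes you already have a per-depth-pixel user mask and a lookup from depth pixels to color pixels; the names GreenScreenSketch, KeepUserPixels, userMask and depthToColorIndex are hypothetical and are not part of the SDK. In a real application that lookup would have to come from the SDK's depth-to-color coordinate mapping, which is not shown here.

```csharp
using System;

public static class GreenScreenSketch
{
    private const int Bgra32BytesPerPixel = 4;

    // colorBgra:         BGRA32 pixels from the color stream.
    // userMask:          one flag per depth pixel, true where the pixel belongs to the user.
    // depthToColorIndex: ASSUMPTION - for each depth pixel, the index of the matching
    //                    color pixel, or -1 when there is no match.
    // Returns a BGRA32 buffer that keeps the user's color pixels and leaves every
    // other pixel fully transparent.
    public static byte[] KeepUserPixels(byte[] colorBgra, bool[] userMask, int[] depthToColorIndex)
    {
        if (userMask.Length != depthToColorIndex.Length)
        {
            throw new ArgumentException("Mask and lookup must have one entry per depth pixel.");
        }

        // A new byte array is zero-filled, so untouched pixels have alpha 0.
        byte[] output = new byte[colorBgra.Length];

        for (int depthIndex = 0; depthIndex < userMask.Length; depthIndex++)
        {
            int colorIndex = depthToColorIndex[depthIndex];
            int offset = colorIndex * Bgra32BytesPerPixel;

            // Skip background pixels and depth pixels without a valid color match.
            if (!userMask[depthIndex] || colorIndex < 0 || offset + Bgra32BytesPerPixel > colorBgra.Length)
            {
                continue;
            }

            Array.Copy(colorBgra, offset, output, offset, Bgra32BytesPerPixel);
        }

        return output;
    }
}
```

The returned buffer could then be written to a Bgra32 WriteableBitmap in the same way the silhouette is written above.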

You can adapt the code to simulate more closely the getUserPixels function if you like. The big part you'd be interested in would be, given a depth frame and a playerIndex:

if (depthFrame != null)
{
    // check if the format has changed.
    bool haveNewFormat = this.lastImageFormat != depthFrame.Format;

    if (haveNewFormat)
    {
        this.pixelData = new short[depthFrame.PixelDataLength];
        this.depthFrame32 = new byte[depthFrame.Width * depthFrame.Height * Bgra32BytesPerPixel];
        this.convertedDepthBits = new byte[this.depthFrame32.Length];
    }

    depthFrame.CopyPixelDataTo(this.pixelData);

    for (int i16 = 0, i32 = 0; i16 < pixelData.Length && i32 < depthFrame32.Length; i16++, i32 += 4)
    {
        int player = pixelData[i16] & DepthImageFrame.PlayerIndexBitmask;
        if (player == this.playerIndex)
        {
            // this pixel "belongs" to the user identified in "playerIndex"
        }
        else
        {
            // not the requested user
        }
    }
}
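If you want something closer in shape to getUserPixels, the loop above could be wrapped in a small helper that returns one flag per depth pixel. This is only a sketch under a few assumptions: the class and method names (UserPixelSketch, GetUserPixels) are my own, the input is the short[] filled by CopyPixelDataTo, and the local constant mirrors DepthImageFrame.PlayerIndexBitmask so the snippet compiles without the SDK.

```csharp
using System;

public static class UserPixelSketch
{
    // Same value as DepthImageFrame.PlayerIndexBitmask in the v1.x SDK:
    // the low three bits of each depth pixel hold the player index.
    private const int PlayerIndexBitmask = 7;

    // Returns one flag per depth pixel: true where the pixel belongs to playerIndex.
    public static bool[] GetUserPixels(short[] pixelData, int playerIndex)
    {
        if (pixelData == null)
        {
            throw new ArgumentNullException("pixelData");
        }

        bool[] mask = new bool[pixelData.Length];

        for (int i = 0; i < pixelData.Length; i++)
        {
            int player = pixelData[i] & PlayerIndexBitmask;

            // Player 0 means "no player", so it never counts as a match.
            mask[i] = player != 0 && player == playerIndex;
        }

        return mask;
    }
}
```

Inside OnDepthFrameReady it would be called as UserPixelSketch.GetUserPixels(this.pixelData, this.playerIndex) after CopyPixelDataTo has filled the buffer.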

Evil Closet Monkey - I have a quick question for you. You are saying that the GreenScreen-WPF application from the official Kinect SDK 1.6 release relies on motion. I honestly do not think that this is true. If you look at the code you will see the following: DepthImagePixel depthPixel = this.depthPixels[depthIndex]; int player = depthPixel.PlayerIndex; // if we're tracking a player for the current pixel, do green screen if (player > 0). From this you can see that they are basically making a decision on whether a given depth pixel is a player's pixel. — oneiros, Feb 01 '13 at 23:48
