It's good you're aligning the RGB and depth streams.
There are few things that could be improved in terms of efficiency:
No need to reload a black image every single frame (in the draw() loop) since you're modifying all the pixels anyway:
mask = loadImage("black640.jpg"); //just a black image
Also, since you don't need the x,y coordinates as you loop through the user data, you can use a single for loop which should be a bit faster:
for(int i = 0 ; i < numPixels ; i++){
mask.pixels[i] = userMap[i] > 0 ? color(255) : color(0);
}
instead of:
for (int y = 0; y < ySize; y++) {
for (int x = 0; x < xSize; x++) {
int index = x + y*xSize;
if (userMap[index]>0) {
mask.pixels[index]=color(255, 255, 255);
}
}
}
Another hacky thing you could do is retrieve the userImage()
from SimpleOpenNI, instead of the userData()
and apply a THRESHOLD
filter to it, which in theory should give you the same result as above.
For example:
int[] userMap = context.userMap();
background(0, 0, 0);
mask = loadImage("black640.jpg"); //just a black image
int xSize = context.depthWidth();
int ySize = context.depthHeight();
mask.loadPixels();
for (int y = 0; y < ySize; y++) {
for (int x = 0; x < xSize; x++) {
int index = x + y*xSize;
if (userMap[index]>0) {
mask.pixels[index]=color(255, 255, 255);
}
}
}
could be:
mask = context.userImage();
mask.filter(THRESHOLD);
In terms of filtering, if you want to shrink the silhouette you should ERODE
and bluring should give you a bit of that Photoshop like feathering.
Note that some filter() calls take arguments (like BLUR
), but others don't like the ERODE
/DILATE
morphological filters, but you can still roll your own loops to deal with that.
I also recommend having some sort of easy to tweak interface (it can be fancy slider or a simple keyboard shortcut) when playing with filters.
Here's a rough attempt at the refactored sketch with the above comments:
import SimpleOpenNI.*;
SimpleOpenNI context;
PImage mask;
int numPixels = 640*480;
int dilateAmt = 1;
int erodeAmt = 1;
int blurAmt = 0;
void setup()
{
size(640*2, 480);
context = new SimpleOpenNI(this);
if (context.isInit() == false)
{
exit();
return;
}
context.enableDepth();
context.enableRGB();
context.enableUser();
context.alternativeViewPointDepthToImage();
mask = createImage(640,480,RGB);
}
void draw()
{
frame.setTitle(int(frameRate) + " fps");
context.update();
int[] userMap = context.userMap();
background(0, 0, 0);
//you don't need to keep reloading the image every single frame since you're updating all the pixels bellow anyway
// mask = loadImage("black640.jpg"); //just a black image
// mask.loadPixels();
// int xSize = context.depthWidth();
// int ySize = context.depthHeight();
// for (int y = 0; y < ySize; y++) {
// for (int x = 0; x < xSize; x++) {
// int index = x + y*xSize;
// if (userMap[index]>0) {
// mask.pixels[index]=color(255, 255, 255);
// }
// }
// }
//a single loop is usually faster than a nested loop and you don't need the x,y coordinates anyway
for(int i = 0 ; i < numPixels ; i++){
mask.pixels[i] = userMap[i] > 0 ? color(255) : color(0);
}
//erode
for(int i = 0 ; i < erodeAmt ; i++) mask.filter(ERODE);
//dilate
for(int i = 0 ; i < dilateAmt; i++) mask.filter(DILATE);
//blur
mask.filter(BLUR,blurAmt);
mask.updatePixels();
//preview the mask after you process it
image(mask, 0, 0);
PImage rgb = context.rgbImage();
rgb.mask(mask);
image(rgb, context.depthWidth() + 10, 0);
//print filter values for debugging purposes
fill(255);
text("erodeAmt: " + erodeAmt + "\tdilateAmt: " + dilateAmt + "\tblurAmt: " + blurAmt,15,15);
}
void keyPressed(){
if(key == 'e') erodeAmt--;
if(key == 'E') erodeAmt++;
if(key == 'd') dilateAmt--;
if(key == 'D') dilateAmt++;
if(key == 'b') blurAmt--;
if(key == 'B') blurAmt++;
//constrain values
if(erodeAmt < 0) erodeAmt = 0;
if(dilateAmt < 0) dilateAmt = 0;
if(blurAmt < 0) blurAmt = 0;
}
Unfortunately I can't test with an actual sensor right now, so please use the concepts explained, but bare in mind the full sketch code isn't tested.
This above sketch (if it runs) should allow you to use keys to control the filter parameters (e/E to decrease/increase erosion, d/D for dilation, b/B for blur). Hopefully you'll get satisfactory results.
When working with SimpleOpenNI in general I advise recording an .oni file (check out the RecorderPlay example for that) of a person for the most common use case. This will save you some time on the long run when testing and will allow you to work remotely with the sensor detached. One thing to bare in mind, the depth resolution is reduced to half on recordings (but using a usingRecording
boolean flag should keep things safe)
The last and probably most important point is about the quality of the end result. Your resulting image can't be that much better if the source image isn't easy to work with to begin with. The depth data from the original Kinect sensor isn't great. The Asus sensors feel a wee bit more stable, but still the difference is negligible in most cases. If you are going to stick to one of these sensors, make sure you've got a clear background and decent lighting (without too much direct warm light (sunlight, incandescent lightbulbs, etc.) since they may interfere with the sensor)
If you want a more accurate user cut and the above filtering doesn't get the results you're after, consider switching to a better sensor like KinectV2. The depth quality is much better and the sensor is less susceptible to direct warm light. This may mean you need to use Windows (I see there's a KinectPV2 wrapper available) or OpenFrameworks(c++ collections of libraries similar to Processing) with ofxKinectV2