
I have both RGB and depth images from a Kinect in PNG format. I'm trying to use the depth data with watershed segmentation, but I don't know how to combine the two sources to obtain a more accurate result. I checked some papers, but I either didn't understand the results or couldn't find a solution written specifically for the watershed algorithm. How can I include the depth data as a reference point in the segmentation process?

I'm using MATLAB's Image Processing Toolbox.

The images are from Nathan Silberman et al.'s database on Silberman's website.

An example RGB image and its corresponding depth file are shown below (note that the depth image, originally a binary image, has been converted to uint8):

[RGB image]
[Depth image (converted to uint8)]

Update: I tried to create a weighted grayscale image from the RGB source together with the depth data, taking each channel (red, green, blue, and depth), assigning each a weight, and summing the weighted values for every corresponding pixel. But the resulting grayscale image does not improve the result significantly; it is not much better than the purely RGB-based segmentation. What else could I do if I follow this approach? Alternatively, how can I see the effects of the depth data?
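For concreteness, here is a minimal sketch of that weighted-combination approach; the file names and the equal weights are placeholder assumptions, and the depth file is assumed to be registered to, and the same size as, the RGB image:

    % Sketch of the weighted grayscale combination described above.
    % 'rgb.png', 'depth.png' and the equal weights are placeholders;
    % tune the weights for your data.
    rgb   = im2double(imread('rgb.png'));       % M-by-N-by-3, in [0,1]
    depth = im2double(imread('depth.png'));     % M-by-N, in [0,1]

    wR = 0.25; wG = 0.25; wB = 0.25; wD = 0.25; % example weights, sum to 1
    gray = wR*rgb(:,:,1) + wG*rgb(:,:,2) + wB*rgb(:,:,3) + wD*depth;

    % Watershed is usually run on a gradient-magnitude image,
    % not on the intensity image itself
    g = imgradient(gray);
    L = watershed(g);
    imshow(label2rgb(L, 'jet', 'w', 'shuffle'));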

  • Can you include such images? Depending on the situation, I prefer to use a different implementation, Meyer's watershed algorithm, which takes seed points. – mmgp Jan 10 '13 at 01:52
  • Images are added! Yes, Meyer's would be great, but I want to try without seed points. – minyatur Jan 13 '13 at 23:50
  • You misunderstood me: the seed points would be found automatically to create a marker image, but the watershed implementation has to accept an image along with a marker image. I'm just mentioning the use of a different implementation (see the sketch after these comments). – mmgp Jan 14 '13 at 00:02
  • Please clarify what you mean by the original depth image being a binary one. It doesn't make sense for depth to be binary, and the included image has levels 0, 88, 98, 104. I can't think of any transformation that would turn {0, 1} into {0, 88, 98, 104}. – mmgp Jan 14 '13 at 00:07
  • Here is the watershed of the gradient of what you posted as the depth file: http://i.imgur.com/gyPdy.png. I don't see how it is related to the input image; you need to clarify the relation between them. – mmgp Jan 14 '13 at 00:17
  • The initial pixel values are 'single', and it displays as a black-and-white image if I do not convert it to uint8. I'm trying to improve the RGB-based segmentation by somehow including the depth data. I know it is related because it's the given depth data for that specific RGB image. – minyatur Jan 14 '13 at 02:47
  • Can you explain the meaning of the image I just included? – mmgp Jan 14 '13 at 02:51
  • You applied segmentation only to the depth data, which is irrelevant on its own. What I'm trying to do is preprocess the RGB image with this data before the segmentation step. – minyatur Jan 14 '13 at 03:04
  • You are misunderstanding what I'm trying to say. Everything relevant that your depth image contains has been shown in that earlier image. How are you going to relate it (ignore the segmentation for a moment) to the other image? You obviously know something that I don't; I can't relate the two images. Therefore, there is nothing to combine with the other image. The "combined segmentation" cannot happen just because you want it to; you need to make explicit how you relate the data. – mmgp Jan 14 '13 at 03:13
  • I've downloaded the data set myself and found RGB images and their corresponding depth images (yours aren't corresponding). You will have trouble combining both kinds of data in a watershed application; if you can point to some paper that does that, I would check it. The site that provides this data set also links at least one paper. I went through it quickly and found this at the segmentation step: "We also experimented with incorporating edges from depth ... maps, but found them unhelpful, mostly because discontinuities in depth ... are usually manifest as intensity discontinuities." – mmgp Jan 14 '13 at 16:58
  • I'm sorry, I can't follow. I assumed that because, for the raw dataset from the same website, it says that the functions in the toolbox they provide match the corresponding depth files to the RGB images. Well, I have the 'labeled' dataset, which has 3 major parts: images, depths, and labels. So, if the images do not correspond to each other, what should I do? I feel we are moving off topic. The main question was how to use depth data in RGB segmentation; I want to understand the concept of including such data to improve the segmentation results. That's all. – minyatur Jan 14 '13 at 22:01
  • There is no such thing, in the sense that there are possibly many ways to "include" such data, but the paper itself didn't use it in the segmentation because it didn't help. Do you have any paper that uses this (specifically this) data to do what you are after? – mmgp Jan 14 '13 at 22:05
  • Yes, there is one that I encountered. It is from IEEE's database; the article is named _"RGB-(D) Scene Labeling: Features and Algorithms"_. – minyatur Jan 14 '13 at 22:14
  • We must be talking about different things, because this paper doesn't even mention watershed (but it mentions how it considered information from RGB and depth). This paper and the other one actually refer to another one when dealing with the segmentation step, which is "Contour Detection and Hierarchical Image Segmentation". But the method they are mentioning as "Pb" (Posterior probability of a boundary at point (x, y) with orientation theta) is actually defined at "Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues". – mmgp Jan 14 '13 at 22:37
  • Yes, you are right. Thank you for your efforts, I will try to do some more research. – minyatur Jan 14 '13 at 23:19
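
For reference, the marker-controlled variant discussed in the comments can be sketched in MATLAB's Image Processing Toolbox roughly as follows. The file name, the choice of imextendedmin for finding seeds automatically, and the 0.1 threshold are assumptions for illustration, not something taken from the thread:

    % Hedged sketch of a marker-controlled watershed: seeds (markers)
    % are found automatically and imposed as the only regional minima
    % of the gradient image before flooding.
    rgb = imread('rgb.png');                  % placeholder file name
    g   = imgradient(im2double(rgb2gray(rgb)));

    markers = imextendedmin(g, 0.1);          % automatic seeds; tune the 0.1
    g2 = imimposemin(g, markers);             % minima only at the seed regions
    L  = watershed(g2);
    imshow(label2rgb(L, 'jet', 'w', 'shuffle'));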

1 Answer


Unless you had an error uploading, the depth image is black and doesn't contain any depth data. Keep in mind that you are comparing apples and pears here, as the Dutch saying goes. Watershed images are not depth images; they are extractions of contours.

Then there is the next place where things go wrong: depth images have a lower resolution than color images. For the Kinect v2 it is only 512×424, and the original Kinect's true depth resolution is even lower than the bitmap size it returns (it produces a low-resolution depth map in which not every pixel is the result of a measurement, in contrast to the Kinect v2). The v2 also has better video output.
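
If you do combine the two streams, their resolutions have to be matched first. A rough sketch with placeholder file names; note that a plain resize is only an approximation, and the Kinect SDK's coordinate mapping is the proper way to register the streams:

    % Upsample the low-resolution depth map to the color resolution.
    % Nearest-neighbour avoids inventing intermediate depth values.
    rgb     = imread('color.png');            % e.g. 1920x1080 on Kinect v2
    depth   = imread('depth.png');            % e.g. 512x424 on Kinect v2
    depthUp = imresize(depth, [size(rgb,1) size(rgb,2)], 'nearest');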

If you want a better watershed of an RGB image, average multiple camera frames to get rid of camera noise.
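
A minimal sketch of that averaging step, assuming several registered frames of a static scene (the file names are placeholders):

    % Average several frames of the same (static) scene to suppress
    % sensor noise before computing the gradient and the watershed.
    files = {'frame1.png', 'frame2.png', 'frame3.png'};   % placeholder names
    acc = 0;
    for k = 1:numel(files)
        acc = acc + im2double(imread(files{k}));
    end
    avg = acc / numel(files);                 % noise falls roughly as 1/sqrt(N)
    L = watershed(imgradient(rgb2gray(avg)));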

P.S. I recommend downloading the Windows Kinect SDK and taking a look at the samples provided with it.

Peter