Object classification with Kinect using cascaded classifiers

Question

My project is to create software that recognizes certain objects, such as an apple or a coin, and I want to use a Kinect. My question is: do I need a machine learning algorithm such as a Haar classifier to recognize an object, or can the Kinect do that by itself?

Autonomous · Accepted Answer · 2014-03-26T21:51:01.267

4

The Kinect itself cannot recognize objects. It gives you a dense depth map. You can then use depth features along with some simple image features (in your case, color features or gradient features would probably do the job). You feed those features to a classifier (an SVM or a Random Forest, for example) to train the system, and then use the trained model to classify new samples.
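The train-then-predict flow described above can be sketched as follows. This is only an illustration under loud assumptions: a nearest-centroid rule stands in for the SVM or Random Forest, the function names `train_centroids` and `predict` are hypothetical, and the feature values (mean depth, mean red, mean green) are made up rather than measured with a Kinect.

```python
import math

def train_centroids(samples, labels):
    """'Training': average the feature vectors of each class."""
    sums, counts = {}, {}
    for vec, lab in zip(samples, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(model, vec):
    """'Testing': return the label whose centroid is closest to vec."""
    return min(model, key=lambda lab: math.dist(model[lab], vec))

# Made-up feature vectors: [mean depth (m), mean red, mean green]
train_x = [[0.80, 0.90, 0.20], [0.82, 0.85, 0.25],   # apple samples
           [0.79, 0.60, 0.55], [0.81, 0.62, 0.50]]   # coin samples
train_y = ["apple", "apple", "coin", "coin"]

model = train_centroids(train_x, train_y)
print(predict(model, [0.80, 0.88, 0.22]))  # a new, unseen sample
```

In a real system you would replace the nearest-centroid rule with a proper classifier and the toy vectors with features extracted from the depth and color images.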

Regarding Haar features, I think they could do the job but you would need a sufficiently large database of features. It all depends on what you want to detect. In the case of an apple and a coin, just color would suffice.

Refer to this paper to get an idea of how to perform human pose recognition using a Kinect camera. You only have to pay attention to their depth features and their classifiers. Do not apply their approach directly; your problem is simpler.

Edit: simple gradient orientations histogram

Gradient orientations can give you a coarse idea of the shape of the object (strictly speaking this is not a shape feature, and better shape features exist, but this one is extremely fast to calculate).

Code snippet:

% Calculate the image gradient.
[dx,dy] = gradient(double(img));
% Orientation of each pixel in degrees, in the range [-90, 90].
% eps is added to avoid division by zero.
A = (atan(dy./(dx+eps))*180)/pi;

A will contain the orientation of each pixel. Segment your original image according to the depth values. For a segment with similar depth values, calculate a color histogram. Extract the pixel orientations corresponding to that region (call it A_r) and calculate a 9-bin histogram of them (you can use more bins; nine bins means each bin covers 180/9 = 20 degrees). Concatenate the color features and the gradient histogram. Do this for a sufficient number of leaves. Then you can give these feature vectors to a classifier for training.
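For readers who are not using MATLAB, here is a rough Python sketch of the same idea. It is not a port of the snippet above: it uses central differences on interior pixels only, skips completely flat pixels (my choice, not part of the original), and the function name `orientation_histogram`, the optional `mask` argument (standing in for a depth segment), and the 3-bin color histogram values are all hypothetical.

```python
import math

def orientation_histogram(img, mask=None, bins=9):
    """Normalized histogram of gradient orientations over [-90, 90) degrees."""
    h, w = len(img), len(img[0])
    hist = [0] * bins
    width = 180.0 / bins                      # 20 degrees per bin for 9 bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if mask is not None and not mask[y][x]:
                continue                      # pixel is outside the segment
            dx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            dy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            if dx == 0 and dy == 0:
                continue                      # flat pixel: no orientation
            angle = math.degrees(math.atan(dy / dx)) if dx != 0 else 90.0
            # +90 and -90 describe the same orientation, so wrap 90 to bin 0
            hist[int((angle + 90.0) // width) % bins] += 1
    total = sum(hist)
    return [c / total for c in hist] if total else [0.0] * bins

# Tiny grayscale image with one vertical edge (intensity changes along x only)
edge = [[0, 0, 10, 10] for _ in range(4)]
grad_hist = orientation_histogram(edge)

color_hist = [0.7, 0.2, 0.1]                  # made-up 3-bin color histogram
feature = color_hist + grad_hist              # concatenated feature vector
print(feature)
```

With the vertical edge, every interior pixel has dy = 0, so all votes fall into the bin that contains 0 degrees.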

Edit: This is a reply to a comment below.

Regarding MaxDepth parameter in opencv_traincascade

The documentation says, "Maximal depth of a weak tree. A decent choice is 1, that is case of stumps". When you perform binary classification with a stump, it takes the form:

if yourFeatureValue>=learntThresh
   class=1;
else
   class=0;
end

A classifier of this type, which thresholds a single feature value (a scalar), is called a decision stump. There is only one split between the positive and negative class (therefore maxDepth is one). For example, it would work in the following scenario. Imagine you have a 1-D feature:

f=[1 2 3 4 -1 -2 -3 -4]

The first 4 are class 1; the rest are class 0. A decision stump would get 100% accuracy on this data by setting the threshold to zero. Now, imagine a more complicated feature space such as:

f=[1 2 3 4 5 6 7 8 9 10 11 12];

The first 4 and the last 4 are class 1; the rest are class 0. Here, a single decision stump cannot reach 100% accuracy: you need two thresholds/splits. Therefore, you can construct a tree with depth 2, which gives you 2^(2-1)=2 thresholds at its deepest level. For depth=3 you get 4 thresholds at the deepest level, for depth=4 you get 8, and so on. Here, I assume a tree with a single node has height 1.
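The two toy examples above can be checked with a short Python sketch. It is only an illustration: the function names are hypothetical, the stump search also allows the flipped rule (class 0 above the threshold), and the two thresholds 4.5 and 8.5 for the second data set are picked by hand rather than learnt.

```python
def stump_accuracy(features, labels, thresh):
    """Accuracy of 'class 1 if value >= thresh, else class 0' or its flipped version."""
    hits = sum((f >= thresh) == bool(l) for f, l in zip(features, labels))
    return max(hits, len(labels) - hits) / len(labels)

def best_stump(features, labels):
    """Best accuracy over thresholds placed between neighbouring feature values."""
    vals = sorted(set(features))
    candidates = [vals[0] - 1] + [(a + b) / 2 for a, b in zip(vals, vals[1:])]
    return max(stump_accuracy(features, labels, t) for t in candidates)

def two_split_accuracy(features, labels, low, high):
    """Two thresholds: class 1 outside [low, high), class 0 inside."""
    hits = sum((f < low or f >= high) == bool(l) for f, l in zip(features, labels))
    return hits / len(labels)

f1 = [1, 2, 3, 4, -1, -2, -3, -4]
y1 = [1, 1, 1, 1, 0, 0, 0, 0]

f2 = list(range(1, 13))
y2 = [1] * 4 + [0] * 4 + [1] * 4

print(best_stump(f1, y1))                    # one threshold is enough here
print(best_stump(f2, y2))                    # one threshold is not enough here
print(two_split_accuracy(f2, y2, 4.5, 8.5))  # two thresholds separate the classes
```

On the second data set the best single threshold classifies only 8 of the 12 values correctly, while the two hand-picked thresholds classify all of them correctly.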

You may feel that more levels always give more accuracy, but then you run into overfitting (and higher computation and memory costs). Therefore, you have to choose a good value for the depth. I usually set it to 3.

edited Mar 26 '14 at 21:51

answered Feb 12 '14 at 22:56

Autonomous


First, in my case I want to recognize a kind of leaf called Saman. I guess the features I need are size and color: the leaves are green, yellow, and brown when the leaf is dead. I guess that I need a large database. Do you think a Haar classifier is a good idea, or do you have better options? – user2676907 Feb 12 '14 at 23:16
You said it. Simple color features are a better idea. Look for color histograms. By the way, why do you want to use kinect for leaf classification? – Autonomous Feb 12 '14 at 23:33
Nothing for now. That is my project: someone shows a leaf in front of the Kinect, and the Kinect decides whether the object is a leaf and, if it is, whether it is a Saman leaf. Later I will make a prototype that can do something with those objects. That is why I want to know whether you think a Haar classifier is a good idea or whether there are better options. What do you think? – user2676907 Feb 12 '14 at 23:45
@user2676907 I would say, no need to use haar features. Use depth features along with color features (for example, color histogram). – Autonomous Feb 13 '14 at 01:36
Oh, I see. You are saying that I don't have to train a Haar classifier; instead, with only depth features and color features I can recognize the leaf, right? – user2676907 Feb 13 '14 at 04:10
But colors are sensitive to illumination, and I think recognizing a leaf in the morning is different from recognizing it at night, I guess. – user2676907 Feb 13 '14 at 04:18
Two options. 1. Use enough training data to include as many cases as possible. 2. Use gradient histogram features to model the shape of the leaf. – Autonomous Feb 13 '14 at 15:24
Can you explain more about that gradient histogram, or do you have a link about it? – user2676907 Feb 13 '14 at 22:04
@user2676907 I have changed the title to suit your question. You are free to change it back if you don't agree with it. – Autonomous Feb 13 '14 at 23:22
So it is very difficult to collect samples of one kind of leaf because it comes in many different sizes. Saman leaves have different sizes and it is very difficult to collect all of them. Do you have any suggestions? – user2676907 Feb 14 '14 at 20:37
Another question: if I decide to use Haar classifiers, I must take a lot of photos of the object I want to recognize. If I then show an object that is 70% or 80% similar to the photos I used for training, will the machine recognize it as a leaf, or must the leaf I show be exactly like the photos I used for training? – user2676907 Feb 14 '14 at 21:00
It is not mandatory. If it is 70% similar then most likely your features will be similar and then SVM classifier (or any other good classifier) should be able to handle it. – Autonomous Feb 15 '14 at 02:34
And are there any parameters I can set to tell the Haar classifier that I want a certain matching percentage? I don't know if minHitRate is the parameter I can set. – user2676907 Feb 15 '14 at 04:11
I also want to tell you that the leaves are about 5 cm or smaller. Does that size affect the training? – user2676907 Feb 15 '14 at 04:18
Which tool are you going to use? If it is OpenCV, it doesn't seem to have that parameter, but in theory you can set it. As you increase the number of weak classifiers, the error rate should go down (or stay the same, if it can't decrease). You have to read up on AdaBoost and cascade classification for that. I would recommend "Robust Real-time Face Detection" by Viola and Jones to see how these concepts are applied. For your second question, whatever statistics you know, you should encode them in your features. In your case, you can add the size of the leaf as one feature. – Autonomous Feb 15 '14 at 04:41
I understand. The leaves have different colors: green, yellow, and brown. If I cared about the color of the leaf, I would have to take a lot of images for each color. My idea is to capture the size of the leaf instead of its color, because I don't care about color. My question is: if I don't care about color, can I work in gray-scale? For example, I would convert the positive and negative images to gray-scale, train my classifier on them, and convert the webcam frames to gray-scale too. Is that a good idea? – user2676907 Feb 15 '14 at 05:20
Yes. You can use gray-scale images. – Autonomous Feb 15 '14 at 05:49
You have helped me a lot! On the other hand, I want to compare the Haar classifier with an SVM. I guess I need the positive and negative images that I used for the previous training, but how can I convert those images into input for the SVM? – user2676907 Feb 15 '14 at 06:21
Hey, for now I have a few positive images for the Haar classifier, and I have a question: is there any problem if I remove the background of the positive images and leave only the object that my software will recognize? – user2676907 Feb 17 '14 at 02:05
Hey, can you explain what the maxDepth parameter is in opencv_traincascade? – user2676907 Mar 26 '14 at 20:38
OK, I understood. Is there any way to know the best value of maxDepth, or should I try different values and see what happens? – user2676907 Mar 26 '14 at 22:46
You can try different values. You should not go beyond 10. 2^10=1024. You are overfitting as you are increasing levels. – Autonomous Mar 26 '14 at 22:49
10? Don't you think that 10 is too much? – user2676907 Mar 26 '14 at 22:52
It is. That's why I said, don't go beyond 10. Even 5 is fine. – Autonomous Mar 26 '14 at 22:53
OK. Can you explain the maxWeakCount parameter? I understand that it is the maximum number that is used to achieve the maxFalseAlarmRate. I want to know more about it: what is the problem if I increase that value? – user2676907 Mar 26 '14 at 22:57
Post a different question. I cannot extend this thread any longer. – Autonomous Mar 26 '14 at 23:04
But you can reply with another answer, can't you? – user2676907 Mar 26 '14 at 23:10
Another user asked a question about maxDepth and maxWeakCount. Can you answer that? I cannot ask the same question. http://stackoverflow.com/questions/22675234/an-advice-about-parameters-in-opencv-traincascade – user2676907 Mar 27 '14 at 00:10
