I have a few doubts about how to approach my goal. I have an outdoor camera that records people, and I want to draw an ellipse on every person.

Right now I extract the feature points of the people in the frame (using a mask so that only feature points on people are kept), set up an EM algorithm, and train it with my samples (the extracted feature points). The number of clusters is twice the number of people in the image (I estimate that before starting the EM algorithm, using other methods such as pixel counting with a codebook).

My question is

  • (a) Should I train it only on the first frame and then use predict on the following frames? Or,
  • (b) should I train on the feature points of every frame?

Right now I am doing option (b) (I don't use predict) because I don't really know how to use predict.

If I do (a), can you help me with it, and then with how to draw the ellipses? If I do (b), can you help me draw one ellipse for every person? Right now I get several ellipses for the same person when using the cov, mean, etc. (one for the arm, for example).

What I want to achieve is what this paper does using the Gaussian model: Link

Glorfindel
  • If you would draw bounding boxes, rather than ellipses, you could use the function groupRectangles to merge the different bounding boxes. But, more important - why don't you use OpenCV's person detector (based on HOG) or the latent SVM detector with the person model? – GilLevi Oct 07 '13 at 16:02
  • Thanks for your reply. I don't really know what those terms are, but I will definitely look into them. The reason for using what I explained above is that I will be detecting people in very crowded environments (such as malls, Times Square, etc.), and after getting the feature points I will just draw vertical ellipses passing over those feature points. I will come back if I don't understand what you stated, thanks – antonio escudero Oct 07 '13 at 19:51
  • OK, so I'll write my comment as an answer. – GilLevi Oct 07 '13 at 21:29
  • Well, it is an answer that shows other options, but not an answer to what I asked, no offense, hehe. I need someone to really help me with this. @GilLevi, I already looked into the SVM detector, but it is very slow for live video. The HOG one could be better, but in my video the people are very small (around 20x25 px) and HOG won't detect them. That is why I am using the feature points and then clustering those points with EM, but I don't really know how to finish it. Thanks for the help – antonio escudero Oct 07 '13 at 22:26
  • Try to upsample the images and then apply the HOG detector. – GilLevi Oct 08 '13 at 08:29

2 Answers

If you would draw bounding boxes, rather than ellipses, you could use the function groupRectangles to merge the different bounding boxes.

But, more important - for people detection, you can simply use OpenCV's person detector (based on HOG) or the latent SVM detector with the person model.

GilLevi

You should do (b) anyway. With (a) you would be matching the keypoints of every later frame to the clusters (people) found in the first frame, and after a few seconds, once people have moved, those clusters would no longer be relevant.

It seems reasonable to assume that the change from frame to frame will not be overwhelming, so the result of training on frame N-1 is a good seed for training on frame N, and it is likely to converge faster than running EM from scratch on each frame.
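To make the seeding idea concrete, here is a minimal sketch in plain Python rather than OpenCV. The function `em_diagonal`, the initial variance of 25 px², and the toy point sets are all made up for illustration. In OpenCV itself, I believe the equivalent is to call `trainE` with the previous frame's means as the initial means instead of calling `trainEM` from scratch, but check the documentation for your version.

```python
import math

def em_diagonal(points, means, iterations=5, min_var=1.0):
    """EM for a 2-D Gaussian mixture with diagonal covariances, seeded with `means`."""
    k = len(means)
    means = [tuple(m) for m in means]
    variances = [(25.0, 25.0)] * k  # hypothetical initial spread, in px^2
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x, y in points:
            dens = []
            for (mx, my), (vx, vy), w in zip(means, variances, weights):
                e = -0.5 * ((x - mx) ** 2 / vx + (y - my) ** 2 / vy)
                dens.append(w * math.exp(e) / (2.0 * math.pi * math.sqrt(vx * vy)))
            total = sum(dens) or 1e-300  # guard against total underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # empty component: keep its previous parameters
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            vx = sum(r[j] * (p[0] - mx) ** 2 for r, p in zip(resp, points)) / nj
            vy = sum(r[j] * (p[1] - my) ** 2 for r, p in zip(resp, points)) / nj
            means[j] = (mx, my)
            variances[j] = (max(vx, min_var), max(vy, min_var))
            weights[j] = nj / len(points)
    return means, variances, weights

def blob(cx, cy):
    """Toy stand-in for the feature points found on one person."""
    return [(cx + dx, cy + dy) for dx in (-2, 0, 2) for dy in (-4, 0, 4)]

# Frame N-1: two people, EM started from rough guesses.
frame_prev = blob(10, 10) + blob(50, 10)
means_prev, _, _ = em_diagonal(frame_prev, [(0, 0), (60, 20)], iterations=20)

# Frame N: everybody moved a little; seed EM with the previous means.
frame_next = blob(13, 11) + blob(53, 11)
means_next, vars_next, _ = em_diagonal(frame_next, means_prev, iterations=5)
```

Because the seeds are already close to the new positions, a handful of iterations is enough here. If the number of people changes between frames, the seed list has to be extended or pruned before training.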

To draw the ellipses, you can borrow from the Gaussian mixture example in the Python samples:

https://github.com/opencv/opencv/blob/master/samples/python/gaussian_mix.py

Note that if you use a diagonal covariance matrix, your ellipses will be axis-aligned, with their axes parallel to the X and Y axes of the frame, so you can skip calculating the ellipse angle.
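As a rough sketch of the mean and covariance to ellipse step, the following plain-Python helper computes the centre, semi-axes and rotation of the ellipse for one cluster. The name `covariance_to_ellipse` and the `n_sigma` scale factor are my own choices, not part of OpenCV. It uses the closed-form eigen-decomposition of a symmetric 2x2 matrix, so it handles both full and diagonal covariances.

```python
import math

def covariance_to_ellipse(mean, cov, n_sigma=2.0):
    """Return (center, (semi_major, semi_minor), angle_deg) for a 2x2 covariance.

    `n_sigma` (hypothetical parameter) controls how many standard
    deviations the drawn ellipse spans along each principal axis.
    """
    (sxx, sxy), (_, syy) = cov
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam_major = half_trace + root            # largest eigenvalue
    lam_minor = max(half_trace - root, 0.0)  # smallest eigenvalue
    # Direction of the major axis, measured from the X axis of the frame.
    angle_deg = 0.5 * math.degrees(math.atan2(2.0 * sxy, sxx - syy))
    axes = (n_sigma * math.sqrt(lam_major), n_sigma * math.sqrt(lam_minor))
    return (mean[0], mean[1]), axes, angle_deg

# Diagonal covariance with more spread in Y than in X: a "vertical" ellipse.
center, axes, angle = covariance_to_ellipse((40.0, 60.0), [[4.0, 0.0], [0.0, 25.0]])

# With OpenCV available, the result could then be drawn along these lines:
# cv2.ellipse(img, (int(center[0]), int(center[1])),
#             (int(axes[0]), int(axes[1])), angle, 0, 360, (0, 255, 0), 1)
```

For a diagonal covariance the angle only ever comes out as 0 or 90 degrees, which is why it can be skipped: take the square roots of the two variances as the X and Y semi-axes directly.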