How to find the number of detection windows that occurred in a video for a detection algorithm?

Question

I'm using the default HOG detector (hog.detectMultiScale) to detect people in a video. I want to know how many detection windows actually appeared in total over the entire video, because I want to calculate the detection rate and miss rate.

I tried this link, but I'm not convinced by the solution suggested there. The explanation they give is for still images. Does it also hold for videos?

Or is it impossible to find the number of detections per frame?

score 0 · Accepted Answer · answered Jul 07 '15 at 20:02

The HOG detector takes in frames from the video, so you can just treat the video as a series of independent images and calculate your precision and recall from those results.

You can find the number of detections in a given frame by looking at the length of the output Rect array from hog.detectMultiScale.

To find the total number of detections for the entire video, you would just sum the lengths of the detection result arrays from each frame.
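The per-frame counting and summing described above can be sketched roughly as follows. This is only a minimal sketch in Python: the helper names `count_detections`, `video_frames` and `make_hog_detector` are made up for illustration, and `step` is a hypothetical parameter for processing only every Nth frame. The OpenCV calls are wrapped in their own functions so the counting logic can be tried with any detector.

```python
def count_detections(frames, detect, step=1):
    """Return (per_frame_counts, total) for an iterable of frames.

    `detect` is any callable mapping a frame to a sequence of boxes.
    `step` keeps only every step-th frame (1 means every frame).
    """
    counts = []
    for index, frame in enumerate(frames):
        if index % step != 0:
            continue  # skip frames that are not sampled
        counts.append(len(detect(frame)))
    return counts, sum(counts)


def video_frames(path):
    """Yield frames from a video file using OpenCV."""
    import cv2  # imported here so count_detections works without OpenCV

    capture = cv2.VideoCapture(path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break  # end of video or read error
            yield frame
    finally:
        capture.release()


def make_hog_detector():
    """Build the default OpenCV people detector as a frame -> boxes callable."""
    import cv2

    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    def detect(frame):
        # detectMultiScale returns the boxes and their SVM weights
        boxes, _weights = hog.detectMultiScale(frame)
        return boxes

    return detect
```

With OpenCV installed, something like `counts, total = count_detections(video_frames("people.avi"), make_hog_detector())` would give the per-frame counts and their sum, where `"people.avi"` is a placeholder file name.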

Recall is the percentage of positive examples that were correctly detected, which is essentially the same thing as the hit rate.

However, looking only at the recall or hit rate can be extremely misleading. For example, you could classify every location in the image as a person and you would have a recall and hit rate of 100%, but that defeats the whole purpose of trying to detect something. This is why most people also look at precision, which is the percentage of your detections that are correct.

Not all the detections will contain a person. Only looking at the number of detected boxes and the number of people in an image will not give you an accurate measure of hit rate, recall or precision.
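To make this concrete, here is a rough sketch of how detections could be checked against hand-labelled boxes before computing the rates. It assumes you have ground-truth boxes per frame in the same (x, y, w, h) form as the detector output, and it treats a detection as correct when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5. That threshold is a common convention, not something the detector prescribes, and the function names here are made up for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0  # boxes do not overlap
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def match_frame(detections, ground_truth, threshold=0.5):
    """Greedily match detections to ground-truth boxes in one frame.

    Returns (true_positives, false_positives, false_negatives).
    """
    remaining = list(range(len(ground_truth)))  # indices not yet matched
    tp = 0
    for det in detections:
        best = max(remaining, key=lambda i: iou(det, ground_truth[i]), default=None)
        if best is not None and iou(det, ground_truth[best]) >= threshold:
            remaining.remove(best)  # each person can be matched only once
            tp += 1
    fp = len(detections) - tp
    fn = len(remaining)
    return tp, fp, fn


def evaluate(per_frame, threshold=0.5):
    """per_frame: iterable of (detections, ground_truth) pairs, one per frame."""
    tp = fp = fn = 0
    for detections, ground_truth in per_frame:
        t, f, n = match_frame(detections, ground_truth, threshold)
        tp, fp, fn = tp + t, fp + f, fn + n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    miss_rate = fn / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall, "miss_rate": miss_rate}
```

The greedy matching is the simplest choice; it processes detections in the order given, so sorting them by detector weight first would be a reasonable refinement.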

Suppose I have a 3-minute video captured at 25 frames per second. Should I then calculate this for all 4500 frames? Won't that be cumbersome to do manually? — 10061990, Jul 08 '15 at 06:05
@10061990 It depends on what you are trying to do. At most I would look at a single frame per second, because there is not much motion over the course of a second. However, if you just want to see how well it works in general terms, I would draw the detections on each frame, write them out as a new video, and judge visually how well it is working, even though that would not tell me objectively how well it performs. — Trevor Fiez, Jul 09 '15 at 15:18
Thank you very much. I had been struggling with this for the past couple of days; now it seems easy. — 10061990, Jul 10 '15 at 05:01
