Possible reasons for OpenCV HaarCascade insufficient samples

Question

Info

I am currently trying to train a HaarCascade classifier. I got to a point where the training process is working, and I was able to train an at least "working" classifier that detects a lot of objects correctly. Now I am trying to improve the results by adding more positive and negative samples.

Idea

As the classifier works a lot of the time, I decided to let it run over a testing dataset and crop out the positively classified images. I cleaned these cropped images manually and now want to add them as new positives to my training dataset.

Problem

After starting another training run I ran into the following error:

POS current samples: 166
POS current samples: 167
POS current samples: 168
POS OpenCV Error: Bad argument (Can not get new positive sample. The most possible reason is insufficient count of samples in given vec-file. ) in CvCascadeImageReader::PosReader::get, file D:\cv\opencv_3.2.0\sources_withTextModule\apps\traincascade\imagestorage.cpp, line 158

What I tried so far

I am using the Cascade Trainer GUI (3.3.1) for the training process, so I checked the log to see whether the program sets the parameters to the right values. In particular, the positive image count is definitely correct.
Next I tried lowering my minHitRate, even down to 80%, but still had no luck.
I had this problem once in the past. I solved it by simply removing the added positives again, as it was just a small batch. This would still work but is not an acceptable solution this time.

Cascade Trainer GUI Log

The log includes all parameters of the OpenCV calls. I also shortened the paths to relative ones to improve readability.

**************************************************
*************** CREATING SAMPLES *****************
**************************************************
Object : project_name/trainingdata
Fixing file names in negative images folder.
Fixing file names in positive images folder.
Creating negative list project_name/trainingdata/neg.lst
Creating positive list project_name/trainingdata/pos.lst
Running : opencv_createsamples
Info file name: project_name\trainingdata\pos.lst
Img file name: (NULL)
Vec file name: project_name\trainingdata\pos_samples.vec
BG  file name: (NULL)
Num: 319
BG color: 0
BG threshold: 80
Invert: FALSE
Max intensity deviation: 40
Max x angle: 1.1
Max y angle: 1.1
Max z angle: 0.5
Show samples: FALSE
Width: 24
Height: 24
Max Scale: -1
Create training samples from images collection...
Done. Created 319 samples

**************************************************
************* TRAINING CLASSIFIER ****************
**************************************************
Running : opencv_traincascade
PARAMETERS:
cascadeDirName: project_name\trainingdata\classifier
vecFileName: project_name\trainingdata\pos_samples.vec
bgFileName: project_name\trainingdata\neg.lst

numPos: 319
numNeg: 1000
numStages: 16
precalcValBufSize[Mb] : 4096
precalcIdxBufSize[Mb] : 4096
acceptanceRatioBreakValue : -1
stageType: BOOST
featureType: HAAR
sampleWidth: 24
sampleHeight: 24
boostType: GAB
minHitRate: 0.995
maxFalseAlarmRate: 0.5
weightTrimRate: 0.95
maxDepth: 1
maxWeakCount: 100
mode: BASIC
Number of unique features given windowSize [24,24] : 162336

===== TRAINING 0-stage =====
<BEGIN

POS current samples: 1
POS current samples: 2
POS current samples: 3

(...) normal training log produced by opencv (stage0 works without any errors)
(...) then failing at stage1    

POS current samples: 167
POS current samples: 168
POS 
OpenCV Error: Bad argument (Can not get new positive sample. The most 
possible reason is insufficient count of samples in given vec-file.
) in CvCascadeImageReader::PosReader::get, file 
D:\cv\opencv_3.2.0\sources_withTextModule\apps\traincascade\imagestorage.cpp, line 158

Additional thoughts

As this is my first time attempting this, I am still experimenting a lot, so there are a few things that could be going wrong. I am not really sure about them, so I thought I would add them here to be confirmed by someone who knows this stuff.

I am not resizing my positives or my negatives. Most of them range from 50x50 to 200x200 (positives) and from 200x200 to 500x500 (negatives). Is this a problem? I do no resizing because most tutorials resize their images and then only detect that fixed size after training. My goal is to detect objects of differing sizes.
I do not quite understand how sampleWidth and sampleHeight and their ratio need to be handled. Since my images do not all have a square KxK ratio but also KxJ, I was thinking this might be crashing the trainer. But my original dataset (which worked) includes such ratios too.

If you have `minHitRate < 1.0` you will lose positive samples in each stage, since they are already discarded by the current classifier. E.g. if you start with 1000 samples and have a hit rate of 0.997 after stage 1, you have lost 3 samples. In theory you can compute the needed number of positive samples for a given minHitRate and a given numPos, but this will only hold if your samples don't share too many discarded features. Typically, if I choose a minHitRate of 0.999 I choose numPos to be 0.9*numberOfAvailablePositiveSamples, which was always quite ok. - Micka, Dec 12 '18 at 19:05
By the way, your positive sample images will automatically be resized during creation of the .vec file. Your negative samples will be used at full scale, and subimages of them will be used too, since every subimage of a negative sample image is also a negative sample. - Micka, Dec 12 '18 at 19:14
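
As a rough illustration of the estimate Micka mentions, here is a minimal Python sketch. It assumes the rule of thumb commonly cited in OpenCV community discussions, vec_count >= numPos + (numStages - 1) * (1 - minHitRate) * numPos + S, where S is the number of vec-file samples the trainer skips because the current cascade already rejects them. That formula is not stated in this thread, S is unknown before training, and the function names are hypothetical, so treat the numbers as an estimate only.

```python
import math


def required_vec_samples(num_pos, num_stages, min_hit_rate, skipped=0):
    """Estimate how many samples the vec-file should hold for a given numPos.

    Assumption: each stage after the first may consume up to
    (1 - min_hit_rate) * num_pos fresh positives; `skipped` stands in
    for the unknown S term and defaults to 0.
    """
    lost_per_stage = (1.0 - min_hit_rate) * num_pos
    return math.ceil(num_pos + (num_stages - 1) * lost_per_stage + skipped)


def max_num_pos(vec_count, num_stages, min_hit_rate, skipped=0):
    """Invert the estimate: largest numPos a vec-file of vec_count supports."""
    growth = 1.0 + (num_stages - 1) * (1.0 - min_hit_rate)
    return math.floor((vec_count - skipped) / growth)


if __name__ == "__main__":
    # Values taken from the training log in the question.
    print(required_vec_samples(319, 16, 0.995))  # samples the estimate asks for
    print(max_num_pos(319, 16, 0.995))           # numPos the estimate allows
    print(int(0.9 * 319))                        # Micka's 0.9 rule of thumb
```

With the values from the log (319 samples in the vec-file, 16 stages, minHitRate 0.995) the estimate asks for about 343 samples when numPos is 319, so under these assumptions numPos would have to be around 296 or lower; the 0.9 rule from the comment gives 287.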

score 0 · Answer 1 · edited Nov 13 '21 at 04:29

You must use more positive images than negative images! You currently have fewer positive images than negative images.
To avoid this error, use minHitRate = 0.999.

edited Nov 13 '21 at 04:29

tdy

answered Nov 12 '21 at 21:35

A nik
