The OpenCV documentation states that influence trimming can be used "to reduce the computation time for boosted models without substantially losing accuracy". By default, the weight_trim_rate parameter is 0.95. After disabling influence trimming by setting that parameter to 0, I actually get a large speed-up: about 5x on a dataset with 262144 samples, and about 3x on a dataset ten times larger. This is the opposite of the expected behavior. Can anyone explain why this might be happening? Thanks!
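As a rough illustration of what trimming is documented to do (only the samples carrying a weight_trim_rate fraction of the total weight mass take part in training the next weak classifier), here is a minimal standalone sketch. The helper name countSamplesKept is hypothetical and is not part of OpenCV, and the real implementation may select samples differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (not an OpenCV function): counts how many samples
// remain if only the heaviest samples, together holding `trim_rate` of the
// total weight mass, are kept. Values outside (0, 1) are treated as
// "trimming disabled", matching the documented meaning of setting the
// parameter to 0.
std::size_t countSamplesKept(std::vector<double> weights, double trim_rate)
{
    // Boosting weights are assumed to be non-negative.
    assert(std::all_of(weights.begin(), weights.end(),
                       [](double w) { return w >= 0.0; }));

    if (trim_rate <= 0.0 || trim_rate >= 1.0)
        return weights.size();  // trimming off: every sample is used

    // Heaviest samples first.
    std::sort(weights.begin(), weights.end(), std::greater<double>());

    const double total  = std::accumulate(weights.begin(), weights.end(), 0.0);
    const double target = trim_rate * total;

    // Accumulate weight until the requested share of the mass is covered.
    double mass = 0.0;
    std::size_t kept = 0;
    while (kept < weights.size() && mass < target) {
        mass += weights[kept];
        ++kept;
    }
    return kept;
}
```

With weights {8, 4, 2, 1, 1} and a rate of 0.75, the two heaviest samples already cover 12 of the 16 units of weight, so only those two would be kept. Fewer samples per round is why trimming would normally be expected to reduce training time.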

Some example data is below. The baseline is the run with influence trimming disabled (trim 0), which gives an accuracy of 95.03 and a training time of 10.607. With influence trimming turned on at the default of 0.95, the accuracy drops slightly to 94.94, as expected, but training takes about 5x as long.

100 weak classifiers with a max depth of 1
Trim     Accuracy   MSE      Training Time   Speed-up vs. Trim 0 (ratio)
0        95.03       3.989   10.607
0.6       7.88      86.77     1.252          8.472044728
0.7      15.76      78.21     2.319          4.573954291
0.8      33.35      57.73    52.972          0.200237862
0.9      94.68       4.89    52.484          0.202099688
0.95     94.94       4.189   52.31           0.202771937
0.99     95.03       3.99    47.026          0.225556075
0.999    95.02       3.985   44.432          0.238724343

Example code:

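#include <opencv2/ml/ml.hpp>

CvBoost boost;
CvBoostParams boostingParameters;

boostingParameters.boost_type       = CvBoost::REAL;
boostingParameters.weak_count       = 100;   // number of weak classifiers
boostingParameters.weight_trim_rate = 0.95;  // default; 0 disables influence trimming
boostingParameters.max_depth        = 1;     // decision stumps
boostingParameters.use_surrogates   = false;
boostingParameters.max_categories   = 2;
boostingParameters.min_sample_count = 100;

// features: one sample per row; responses: one label per sample
boost.train(features, CV_ROW_SAMPLE, responses,
            cv::Mat(),   // varIdx: use all variables
            cv::Mat(),   // sampleIdx: use all samples
            cv::Mat(),   // varType: default variable types
            cv::Mat(),   // missingDataMask: no missing data
            boostingParameters,
            false);      // update: train from scratch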
Radford Parker