
My objective is to train an SVM and obtain support vectors that I can plug into OpenCV's HOGDescriptor for object detection.

I have gathered ~4000 positives and ~15000 negatives, and I train using the SVM provided by OpenCV. The results give me too many false positives (up to 20 per image). I clip out the false positives and add them to the pool of negatives to retrain, yet at times I end up with even more false positives! I have tried raising the L2HysThreshold of my HOGDescriptor up to 300 without significant improvement. Is my pool of positives and negatives large enough?

The SVM training is also much faster than expected. I have tried feature vector sizes of 2916 and 12996, using grayscale images and color images on separate tries. SVM training has never taken longer than 20 minutes. I use auto_train. I am new to machine learning, but from what I hear, training with a dataset as large as mine should take at least a day, no?

I believe cvSVM is not doing much learning, and according to http://opencv-users.1802565.n2.nabble.com/training-a-HOG-descriptor-td6363437.html, it is not suited for this purpose. Does anyone with experience with cvSVM have more input on this?

I am considering using SVMLight (http://svmlight.joachims.org/), but it looks like there isn't a way to visualize the SVM hyperplane. What are my options?

I use OpenCV 2.4.3 and have tried the following setups for the HOGDescriptor:

hog.winSize = cv::Size(100,100);
hog.cellSize = cv::Size(5,5);
hog.blockSize = cv::Size(10,10);
hog.blockStride = cv::Size(5,5); //12996 feature vector

hog.winSize = cv::Size(100,100);
hog.cellSize = cv::Size(10,10);
hog.blockSize = cv::Size(20,20);
hog.blockStride = cv::Size(10,10); //2916 feature vector
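For what it's worth, both feature-vector lengths check out against the window geometry. Here is a minimal sketch of the arithmetic, assuming OpenCV's default of 9 orientation bins (`hogDescriptorSize` is a hypothetical helper written for illustration, not an OpenCV API):

```cpp
#include <cassert>

// Length of the HOG feature vector for a square detection window,
// assuming OpenCV's default of 9 orientation bins per cell histogram.
// All sizes are in pixels and the window, block, and cell are square.
int hogDescriptorSize(int winSize, int cellSize, int blockSize,
                      int blockStride, int nbins = 9) {
    // Blocks slide across the window with the given stride.
    int blocksPerRow = (winSize - blockSize) / blockStride + 1;
    // Each block contains (blockSize / cellSize)^2 cells.
    int cellsPerBlock = (blockSize / cellSize) * (blockSize / cellSize);
    return blocksPerRow * blocksPerRow * cellsPerBlock * nbins;
}
```

The first setup gives ((100-10)/5+1)^2 = 361 blocks × 4 cells × 9 bins = 12996, and the second gives 9^2 = 81 blocks × 4 cells × 9 bins = 2916, matching the comments above.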
tzl
  • If you're using a descriptor of dimension around 3000 or 10,000, shouldn't you use much more training data? As I recall, a rule of thumb says that the size of the training data should be about 10 times the dimension of the problem. Isn't that correct? – GilLevi Aug 26 '13 at 19:38

2 Answers

  1. Your first descriptor dimension is far too large to be of any use. To form a reliable SVM hyperplane, you need at least as many positive and negative samples as descriptor dimensions, because ideally you need separating information in every dimension of the hyperplane.
  2. The number of positive and negative samples should be more or less the same unless you provide your SVM trainer with a bias parameter (may not be available in cvSVM).
  3. There is no guarantee that HOG is a good descriptor for the type of problem you are trying to solve. Can you visually confirm that the object you are trying to detect has a distinct shape with similar orientation in all samples? A single type of flower for example may have a unique shape, however many types of flowers together don't have the same unique shape. A bamboo has a unique shape but may not be distinguishable from other objects easily, or may not have the same orientation in all sample images.
  4. cvSVM is normally not the tool used to train SVMs for OpenCV's HOG. Use the binary form of SVMLight (not free for commercial purposes) or libSVM (OK for commercial purposes). Calculate HOGs for all samples using your C++/OpenCV code and write them to a text file in the correct input format for SVMLight/libSVM. Use either program to train a model with a linear kernel and the optimal C; find the optimal C by searching for the best accuracy while varying C in a loop. Then compute the detector vector (an N+1 dimensional vector, where N is the dimension of your descriptor): multiply each support vector by its alpha value and sum the results component-wise to obtain an N-dimensional vector; as the last element, append -b, where b is the hyperplane bias (you can find it in the model file produced by SVMLight/libSVM training). Feed this N+1 dimensional detector to HOGDescriptor::setSVMDetector() and use HOGDescriptor::detect() or HOGDescriptor::detectMultiScale() for detection.
Bee
  • Thanks for the answer. Can you elaborate on why cvSVM is not the correct tool? I am curious. – tzl Aug 27 '13 at 16:16
  • There is nothing wrong with cvSVM as an SVM tool, but if you are planning to use `HOGDescriptor::detectMultiScale()`, you need a detector vector that cvSVM does not provide. If you use cvSVM, you have to do the training, then for every test image you have to extract the HOG features manually for a sliding window, feed it to cvSVM, get the result, re-scale, do it again. HOGDescriptor pre-calculates some of the HOGs so the sliding window is way more efficient. – Bee Aug 27 '13 at 17:00
  • Someone seems to have figured out how to get it out of cvSVM; you can see the answer to this: http://stackoverflow.com/questions/15339657/training-custom-svm-to-use-with-hogdescriptor-in-opencv I think my problem is very likely point 3 in your answer. I did not align my objects in the positive images consistently. I shall try with a proper set of positive images with cvSVM first before moving on to SVMLight. Thanks for your help. – tzl Aug 28 '13 at 12:32

I have had successful results using SVMLight to learn SVM models from features extracted with OpenCV, but I haven't used cvSVM, so I can't compare.

The hogDraw function from http://vision.ucsd.edu/~pdollar/toolbox/doc/index.html will visualise your descriptor.

calumblair
  • Thanks for sharing. How about visualizing the hyperplane (edited to make myself clearer) for the SVM? I would like to inspect which kernel type my dataset is better suited to. – tzl Aug 27 '13 at 16:20