
http://opencv-python-tutroals.readthedocs.org/en/latest/py_tutorials/py_ml/py_svm/py_svm_opencv/py_svm_opencv.html

In the example OpenCV gives, the training and test sets each contain 250 samples per digit. However, when the training and testing numbers are changed, the accuracy drops to 0.

# First half is trainData, remaining is testData
train_cells = [i[:40] for i in cells]
test_cells = [i[40:] for i in cells]

train_amt = 200
responses = np.float32(np.repeat(np.arange(10),train_amt)[:,np.newaxis])
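For reference, a minimal sketch of the split arithmetic (not the original script, and no image needed): digits.png is a 50×100 grid of 20×20 cells with 5 rows per digit, so taking `i[:40]` per row gives 200 training and 300 test samples per digit.

```python
import numpy as np

# digits.png layout: 5 rows per digit, 100 cells per row.
# i[:40] keeps 40 train cells per row -> 5 * 40 = 200 per digit;
# the remaining 60 per row -> 5 * 60 = 300 test cells per digit.
rows_per_digit = 5
train_cols, total_cols = 40, 100

train_per_digit = rows_per_digit * train_cols                 # 200
test_per_digit = rows_per_digit * (total_cols - train_cols)   # 300

train_labels = np.float32(np.repeat(np.arange(10), train_per_digit)[:, np.newaxis])
test_labels = np.float32(np.repeat(np.arange(10), test_per_digit)[:, np.newaxis])
print(train_labels.shape, test_labels.shape)  # (2000, 1) (3000, 1)
```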

I have changed the values in the lines above from the original code. What am I doing wrong? And what should x be?

The training and testing data provided with OpenCV: http://wormassay.googlecode.com/svn/trunk/ThirdParty/OpenCV/samples/python2/data/digits.png

sope

1 Answer


Your data splitting is correct. The reason you get 0.0 accuracy is the way you are measuring it.

The accuracy check is done by:

mask = result == responses
correct = np.count_nonzero(mask)
print(correct * 100.0 / result.size)

With the new train/test split this check is no longer correct. For starters, result and responses are not the same length, so the elementwise comparison fails and mask is simply the scalar False.
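To make the shape mismatch concrete, here is a small sketch. The shapes assume the modified 40/60 split, and `result` is just a stand-in for the classifier output (the real one comes from the predictor, not `np.repeat`):

```python
import numpy as np

# Labels built from the train count (2000, 1) vs predictions on
# the test set (3000, 1): these shapes cannot broadcast.
responses = np.float32(np.repeat(np.arange(10), 200)[:, np.newaxis])  # (2000, 1)
result = np.float32(np.repeat(np.arange(10), 300)[:, np.newaxis])     # (3000, 1)

# Older NumPy returns the scalar False for such a comparison
# (newer versions warn or raise), so the mask counts zero hits.
print(responses.shape, result.shape)
print(np.count_nonzero(False))  # 0 -> reported accuracy is 0.0
```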

So if you want to measure accuracy, you need to build the responses according to the test size, not the train size. The working code only changes 200 to 300:

responses = np.float32(np.repeat(np.arange(10), 300)[:, np.newaxis])
mask = result == responses
correct = np.count_nonzero(mask)
print(correct * 100.0 / result.size)

Accuracy goes down a bit, but not to 0.0: you now get 93.1%, which is expected because you shrank the training set and increased the number of test samples.

Guiem Bosch