
I have a question about the svm_predict() method in libsvm.

The README has this quickstart example code:

>>> y, x = [1,-1], [{1:1, 3:1}, {1:-1,3:-1}]
>>> prob  = svm_problem(y, x)
>>> param = svm_parameter('-c 4 -b 1')
>>> m = svm_train(prob, param)

>>> p_label, p_acc, p_val = svm_predict(y, x, m)

Now I understand that y is a list of categories that are associated with the dictionaries in x. I also understand the svm_train part.

The part that does not make sense is that svm_predict requires me to provide the 'true values' from y along with the test data in x. I thought the idea was that I do not know the classifications of the test data ahead of time.

If my training data is:

y = [1, 2, 3]
x = [{1:1}, {1:10}, {1:20}]

but my test data is:

z = [{1:4}, {1:12}, {1:19}]

then why am I required to pass the true values of z into svm_predict(), like:

a, b, c = svm_predict(y, z, m)

I'm not going to know the true values for z--that's what the prediction is for. Should I just put arbitrary classification values for y when I perform a prediction, or am I completely missing something?

Thanks all

Kara

2 Answers


svm_predict uses the true labels to give you accuracy statistics in case you are doing an out-of-sample test with known labels.

If you are running it "online", i.e. you actually don't have the true labels, then just pass [0]*len(z) instead of y.
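For example, reusing the training and test data from the question (a rough sketch, assuming svmutil from libsvm's python/ directory is importable; only p_label is meaningful here, since the reported accuracy is computed against the dummy labels):

>>> from svmutil import *          # libsvm's python binding
>>> y, x = [1, 2, 3], [{1:1}, {1:10}, {1:20}]
>>> m = svm_train(svm_problem(y, x), svm_parameter('-c 4'))
>>> z = [{1:4}, {1:12}, {1:19}]
>>> # dummy labels: ignore the printed accuracy, use p_label only
>>> p_label, p_acc, p_val = svm_predict([0]*len(z), z, m)

p_label then holds the predicted class for each dictionary in z.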

Dr G

You might consider using

http://scikit-learn.sourceforge.net/

It has a great Python binding of libsvm.
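For what it's worth, here is a minimal sketch with scikit-learn's SVC (which wraps libsvm), using the data from the question. The variable names are just illustrative, and scikit-learn takes plain feature lists rather than {index: value} dicts; no true labels are needed at predict time:

>>> from sklearn.svm import SVC
>>> X_train = [[1], [10], [20]]      # one inner list per training sample
>>> y_train = [1, 2, 3]
>>> clf = SVC(C=4)
>>> clf.fit(X_train, y_train)
>>> clf.predict([[4], [12], [19]])   # predicted classes for the unlabeled points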

agramfort