
I do know there are some libraries that let you use Support Vector Machines from Python code, but I am looking specifically for libraries that allow one to train the SVM online (that is, without having to give it all the data at once).

Are there any?

devoured elysium

5 Answers


LibSVM includes a Python wrapper that works via SWIG.

Example svm-test.py from their distribution:

#!/usr/bin/env python

from svm import *

# a three-class problem
labels = [0, 1, 1, 2]
samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
problem = svm_problem(labels, samples)
size = len(samples)

kernels = [LINEAR, POLY, RBF]
kname = ['linear','polynomial','rbf']

param = svm_parameter(C = 10,nr_weight = 2,weight_label = [1,0],weight = [10,1])
for k in kernels:
    param.kernel_type = k
    model = svm_model(problem,param)
    errors = 0
    for i in range(size):
        prediction = model.predict(samples[i])
        if (labels[i] != prediction):
            errors = errors + 1
    print "##########################################"
    print " kernel %s: error rate = %d / %d" % (kname[param.kernel_type], errors, size)
    print "##########################################"

param = svm_parameter(kernel_type = RBF, C=10)
model = svm_model(problem, param)
print "##########################################"
print " Decision values of predicting %s" % (samples[0])
print "##########################################"

print "Numer of Classes:", model.get_nr_class()
d = model.predict_values(samples[0])
for i in model.get_labels():
    for j in model.get_labels():
        if j>i:
            print "{%d, %d} = %9.5f" % (i, j, d[i,j])

param = svm_parameter(kernel_type = RBF, C=10, probability = 1)
model = svm_model(problem, param)
pred_label, pred_probability = model.predict_probability(samples[1])
print "##########################################"
print " Probability estimate of predicting %s" % (samples[1])
print "##########################################"
print "predicted class: %d" % (pred_label)
for i in model.get_labels():
    print "prob(label=%d) = %f" % (i, pred_probability[i])

print "##########################################"
print " Precomputed kernels"
print "##########################################"
samples = [[1, 0, 0, 0, 0], [2, 0, 1, 0, 1], [3, 0, 0, 1, 1], [4, 0, 1, 1, 2]]
problem = svm_problem(labels, samples)
param = svm_parameter(kernel_type=PRECOMPUTED,C = 10,nr_weight = 2,weight_label = [1,0],weight = [10,1])
model = svm_model(problem, param)
pred_label = model.predict(samples[0])   
Ryan Cox
  • Since neither the LibSVM website nor the docs explicitly mentions it, I emailed Chih-Jen Lin, asking about incremental/online learning support. His response was, "Unfortunately no. The reason is that we don't see a standard setting for incremental/decremental learning yet." – Cerin Apr 02 '10 at 17:51

Haven't heard of one. But do you really need online learning? I've been using SVMs for quite some time and have never encountered a problem where I had to use online learning. Usually I set a threshold on the number of new or changed training examples (maybe 100 or 1000) and then just batch-retrain on everything.
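
A minimal sketch of that threshold-and-retrain idea; scikit-learn's SVC is used here purely as a stand-in batch trainer, and the class name, threshold, and parameters are illustrative rather than anything prescribed above:

# Sketch: buffer incoming examples and batch-retrain the SVM once
# enough new examples have accumulated.
from sklearn.svm import SVC

class RetrainOnThreshold:
    def __init__(self, retrain_every=100, **svm_params):
        self.retrain_every = retrain_every        # retrain after this many new examples
        self.svm_params = svm_params
        self.X, self.y = [], []
        self.pending = 0
        self.model = None

    def add_example(self, x, label):
        self.X.append(x)
        self.y.append(label)
        self.pending += 1
        if self.pending >= self.retrain_every:
            self.model = SVC(**self.svm_params).fit(self.X, self.y)   # full batch retrain
            self.pending = 0

    def predict(self, x):
        if self.model is None:
            raise RuntimeError("no model trained yet")
        return self.model.predict([x])[0]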

If your problem is at a scale where you absolutely have to use online learning, then you might want to take a look at vowpal wabbit.

Re-edited below, after the comment:

Olivier Grisel suggested using a ctypes wrapper around LaSVM. Since I didn't know about LaSVM before and it looks pretty cool, I'm intrigued to try it on my own problems :).

If you're limited to the Python VM only (embedded device, robot), I'd suggest using the voted/averaged perceptron, which performs close to an SVM but is easy to implement and "online" by default.
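
For reference, a minimal pure-Python sketch of an averaged perceptron for binary labels in {-1, +1}; this is just the textbook update rule, so treat it as a starting point rather than a tuned implementation:

# Averaged perceptron: online, mistake-driven updates; prediction uses the
# running average of the weight vector, which usually generalizes better.
class AveragedPerceptron:
    def __init__(self, n_features):
        self.w = [0.0] * n_features        # current weight vector
        self.b = 0.0                       # current bias
        self.w_sum = [0.0] * n_features    # running sums for averaging
        self.b_sum = 0.0
        self.n_updates = 0

    def _score(self, w, b, x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    def learn(self, x, y):                 # y is -1 or +1
        if y * self._score(self.w, self.b, x) <= 0:
            for i, xi in enumerate(x):
                self.w[i] += y * xi
            self.b += y
        for i in range(len(self.w)):
            self.w_sum[i] += self.w[i]
        self.b_sum += self.b
        self.n_updates += 1

    def predict(self, x):
        n = max(self.n_updates, 1)
        avg_w = [s / n for s in self.w_sum]
        return 1 if self._score(avg_w, self.b_sum / n, x) >= 0 else -1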

Just saw that Elefant has some online-SVM code.

ephes
  • Has anyone been able to build VW on Linux? I have boost installed, but VW seems to assume a much older version. – Cerin Apr 02 '10 at 18:50
  • LaSVM seems to be the winner here. It has a small codebase. Explicitly supports online learning. Compiles easily (tested on Ubuntu 9.10). It doesn't have a direct Python API, but it creates two simple commandline utilities that can easily be called from Python to build the model (la_svm) and use the model (la_test). – Cerin Apr 02 '10 at 19:07
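
A minimal sketch of driving those two LaSVM tools from Python with subprocess; the positional-argument order for la_svm and la_test below is an assumption (modeled on libsvm's svm-train/svm-predict), so check each tool's usage output before relying on it:

# Sketch: shell out to the LaSVM command-line tools from Python.
# Assumes libsvm-format data files; argument order is an assumption.
import subprocess

def build_model(train_file, model_file):
    subprocess.check_call(["la_svm", train_file, model_file])                # train with la_svm

def apply_model(test_file, model_file, output_file):
    subprocess.check_call(["la_test", test_file, model_file, output_file])   # predict with la_test

build_model("train.libsvm", "model.svm")
apply_model("test.libsvm", "model.svm", "predictions.txt")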

Pegasos is an online SVM algorithm that performs quite nicely. It's also fairly easy to implement, even without a specific Python binding. There is a C implementation on the author's website that is adaptable or embeddable as well.
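
To give an idea of how little code the basic (linear, non-kernelized) Pegasos update takes, here is a NumPy sketch; the step-size schedule and optional projection follow the Pegasos paper, but this is an illustration, not a reference implementation:

# One Pegasos step per incoming example (binary labels -1/+1).
import numpy as np

def pegasos_step(w, x, y, t, lam):
    """Update weight vector w on example (x, y) at time step t."""
    eta = 1.0 / (lam * t)                        # step size 1/(lambda * t)
    if y * np.dot(w, x) < 1:                     # margin violated: hinge-loss gradient step
        w = (1.0 - eta * lam) * w + eta * y * x
    else:                                        # margin satisfied: regularization shrinkage only
        w = (1.0 - eta * lam) * w
    radius = 1.0 / np.sqrt(lam)                  # optional projection onto ||w|| <= 1/sqrt(lambda)
    norm = np.linalg.norm(w)
    if norm > radius:
        w *= radius / norm
    return w

# usage: consume a stream of (features, label) pairs one at a time
w = np.zeros(2)
stream = [([0.0, 1.0], 1), ([1.0, 0.0], -1), ([0.5, 1.5], 1)]
for t, (x, y) in enumerate(stream, start=1):
    w = pegasos_step(w, np.array(x), y, t, lam=0.1)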

Bryce

While there are no Python bindings there, the algorithm described at http://leon.bottou.org/projects/sgd is trained in an online fashion and is easily reimplemented using e.g. numpy.

etarion
  • Stochastic gradient descent (SGD) is also implemented in scikit-learn (http://scikit-learn.sourceforge.net/). Although on-line fitting of the SGD-based classifiers is not yet exposed, it will be in the next 6 months. – Gael Varoquaux Jan 24 '11 at 20:00
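
In later scikit-learn releases, the online interface mentioned in that comment is exposed as SGDClassifier.partial_fit (hinge loss corresponds to a linear SVM). A minimal sketch, with arbitrary parameter values:

# Sketch: online updates via scikit-learn's SGDClassifier.partial_fit.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss="hinge", alpha=1e-4)   # hinge loss = linear SVM
classes = np.array([0, 1])                      # all classes must be declared on the first call

# feed examples (or mini-batches) as they arrive
clf.partial_fit(np.array([[0.0, 1.0], [1.0, 0.0]]), np.array([1, 0]), classes=classes)
clf.partial_fit(np.array([[0.5, 1.5]]), np.array([1]))
print(clf.predict([[0.2, 0.9]]))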

Why would you want to train it online? Adding training instances would usually require re-solving the quadratic programming problem associated with the SVM.

A way to handle this is to train an SVM in batch mode and, when new data is available, check whether these data points fall within the [-1, +1] margin of the hyperplane. If so, retrain the SVM using all of the old support vectors and the new training data that falls in the margin.
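
A rough sketch of that heuristic, using scikit-learn's SVC purely as a stand-in batch trainer; the linear kernel, C value, and -1/+1 label convention are assumptions for illustration:

# Sketch: retrain only when a new example falls inside the [-1, +1] margin.
import numpy as np
from sklearn.svm import SVC

def update(model, X_sv, y_sv, x_new, y_new):
    """X_sv, y_sv: current support vectors and their labels; y_new is -1 or +1."""
    score = model.decision_function([x_new])[0]
    if y_new * score >= 1.0:
        # correctly classified outside the margin: it would not become a
        # support vector, so keep the old solution
        return model, X_sv, y_sv
    # inside the margin (or misclassified): retrain on the old support
    # vectors plus the new example
    X_all = np.vstack([X_sv, [x_new]])
    y_all = np.append(y_sv, y_new)
    model = SVC(kernel="linear", C=1.0).fit(X_all, y_all)
    return model, X_all[model.support_], y_all[model.support_]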

Of course, the results can be slightly different compared to batch training on all your data, as some points can be discarded that would be support vectors later on. So again, why do you want to perform online training of your SVM?

Stardust
  • I already answered your first question above. I'm using it for a reinforcement learning project, thus it needs to learn online. – devoured elysium Nov 29 '09 at 23:33
  • Beware that batch mode can be worse than random if implemented naively. – Davide Dec 06 '09 at 05:58
  • This methodology would not scale. Retraining over the entire data set for each new record would have exponential performance at best. Online learning would have constant or linear performance, depending on the implementation. – Cerin Feb 16 '10 at 16:21