scikit learn SVM, how to save/load support vectors?

Question

Using Python's scikit-learn SVM, after running clf.fit(X, Y) you get your support vectors. Could I load these support vectors directly (passing them as a parameter) when instantiating an svm.SVC object? That way I would not need to run the fit() method each time I want to make predictions.

Possible duplicate http://stackoverflow.com/questions/11440970/how-can-i-save-a-libsvm-python-object-instance — Pedrom, Mar 22 '13 at 11:46

score 23 · Accepted Answer · edited Apr 27 '16 at 11:40

From the scikit manual: http://scikit-learn.org/stable/modules/model_persistence.html

1.2.4 Model persistence

It is possible to save a model in the scikit by using Python's built-in persistence model, namely pickle.

>>> from sklearn import svm
>>> from sklearn import datasets
>>> clf = svm.SVC()
>>> iris = datasets.load_iris()
>>> X, y = iris.data, iris.target
>>> clf.fit(X, y)
SVC(kernel='rbf', C=1.0, probability=False, degree=3, coef0=0.0, eps=0.001,
  cache_size=100.0, shrinking=True, gamma=0.00666666666667)
>>> import pickle
>>> s = pickle.dumps(clf)
>>> clf2 = pickle.loads(s)
>>> clf2.predict(X[0:1])
array([0])
>>> y[0]
0
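
To connect this back to the question: the support vectors are part of the fitted state stored on the estimator, so they travel with the pickled object and fit() does not need to be called again. The check below is a minimal sketch, assuming a recent scikit-learn with NumPy installed; the variable names are illustrative only.

```python
import pickle

import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC().fit(X, y)

# Round-trip through an in-memory pickle; fit() is never called on `restored`.
restored = pickle.loads(pickle.dumps(clf))

# The fitted attributes, including the support vectors, come back unchanged.
same_vectors = np.array_equal(clf.support_vectors_, restored.support_vectors_)
same_predictions = np.array_equal(clf.predict(X), restored.predict(X))
print(same_vectors, same_predictions)
```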

In the specific case of the scikit, it may be more interesting to use joblib’s replacement of pickle, which is more efficient on big data, but can only pickle to the disk and not to a string:

>>> from sklearn.externals import joblib  # scikit-learn >= 0.23: use "import joblib" instead
>>> joblib.dump(clf, 'filename.pkl')
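
The matching load step is joblib.load. A full round trip to disk might look like the sketch below; it assumes the standalone joblib package is installed (newer scikit-learn releases no longer ship sklearn.externals.joblib), and the temporary directory and the file name svc_iris.pkl are made up for the example.

```python
import os
import tempfile

import joblib  # standalone package, assumed to be installed next to scikit-learn
import numpy as np
from sklearn import datasets, svm

X, y = datasets.load_iris(return_X_y=True)
clf = svm.SVC().fit(X, y)

# Hypothetical location: a fresh temporary directory keeps the example self-contained.
model_path = os.path.join(tempfile.mkdtemp(), "svc_iris.pkl")
joblib.dump(clf, model_path)

# Later (for example in another process): load the model instead of refitting.
clf_loaded = joblib.load(model_path)
predictions_match = np.array_equal(clf.predict(X), clf_loaded.predict(X))
print(predictions_match)
```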

The link is broken. Use this instead: http://scikit-learn.org/stable/modules/model_persistence.html — Tommz, May 22 '15 at 18:08
Note that with pickle you tie yourself to a specific scikit version, it is not a good solution for long-term storage of models. — Adversus, Apr 14 '16 at 08:08

score 3 · Answer 2 · answered May 10 '13 at 13:55

You can save the model in order to use it later. The code below loads a model that I fitted and saved earlier if one exists; otherwise it fits a new model and saves it.

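from sklearn import svm
from sklearn.externals import joblib  # scikit-learn >= 0.23: use "import joblib" instead

# dataset_name, data_train and class_train are defined elsewhere
svm_linear_estimator = svm.SVC(kernel='linear', probability=False, C=1)
model_path = "/my_models/%s.pkl" % dataset_name
try:
    estimator = joblib.load(model_path)
    print("using trained model")
except IOError:  # no saved model at that path yet
    print("building new model")
    estimator = svm_linear_estimator.fit(data_train, class_train)
    joblib.dump(estimator, model_path)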

Bilal Dadanlar

When you save the trained model, joblib can create more than one file, but you still load it using the "dataset_name.pkl" name. Also, the variable estimator above should have been svm_linear_estimator. – Bilal Dadanlar May 13 '13 at 08:53
I just realized that os.path.exists() is smarter than using try/except :) – Bilal Dadanlar May 21 '13 at 14:30
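
The os.path.exists() idea from the last comment could be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code: load_or_fit is a hypothetical helper, the iris data stands in for data_train and class_train, and a temporary directory replaces /my_models.

```python
import os
import tempfile

import joblib  # standalone package, assumed to be installed
from sklearn import datasets, svm


def load_or_fit(model_path, data_train, class_train):
    """Hypothetical helper: return (estimator, loaded_from_disk)."""
    if os.path.exists(model_path):
        # A model was saved earlier, so reuse it and skip fit().
        return joblib.load(model_path), True
    # No saved model yet: fit a new one and save it for next time.
    estimator = svm.SVC(kernel="linear", probability=False, C=1)
    estimator.fit(data_train, class_train)
    joblib.dump(estimator, model_path)
    return estimator, False


X, y = datasets.load_iris(return_X_y=True)
model_path = os.path.join(tempfile.mkdtemp(), "iris.pkl")  # made-up path

first, loaded_first = load_or_fit(model_path, X, y)    # fits and saves
second, loaded_second = load_or_fit(model_path, X, y)  # loads from disk
print(loaded_first, loaded_second)
```

Checking for the file up front keeps unrelated errors (a corrupt file, a version mismatch on load) from being mistaken for "no model saved yet".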
