
I have 4 labelled groups which I want to classify using SVM.

Class-A, Class-B, Class-C, Class-D

To train a classifier to recognize Class-A, I copy all the text from A, B, C and D into a file "A-against-all", labelling Class-A as 1 and all the rest as -1. Similarly for B, C and D:

"B-against-all" Class B: 1, all the rest: -1
"C-against-all" Class C: 1, all the rest: -1
"D-against-all" Class D: 1, all the rest: -1

Now if I run SVM on "A-against-all" then I get a classifier as output. Similarly I get three more classifiers for B,C & D.

Now my question is this: how do I integrate these 4 classifiers so that they operate in unison?

Martijn Pieters
  • 1
    I guess the OP meant to say that the file "B-against-all" will have class-B as 1 and rest all as -1, similarly "C-against-all" will have class-C as 1 and rest all as -1 –  Nov 29 '14 at 13:51

1 Answer


I don't know how to merge two or more SVM classifiers into one. For your particular problem, though, you can get the desired SVM by creating a single file containing all the data of A, B, C and D, labelled 1, 2, 3 and 4 respectively. Run SVM on this combined file, and the generated classifier will assign each data point to Class-A, Class-B, Class-C or Class-D.

Here is SVM on the iris data. The iris data has three classes, 0, 1 and 2: samples 0-49 are class 0, samples 50-99 are class 1 and samples 100-149 are class 2.

>>> from sklearn import datasets as DS
>>> iris = DS.load_iris()
>>> from sklearn import svm
>>> clf=svm.SVC()
>>> clf.fit(iris.data,iris.target)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)
>>> clf.predict(iris.data[25:26])    # predict expects a 2D array, so slice out one row
array([0])
>>> clf.predict(iris.data[75:76])
array([1])
>>> clf.predict(iris.data[125:126])
array([2])

As you can see, a single SVM has been trained on data with three class labels, and it predicts the right class for each of these three samples (all taken from the training data).
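
If you do want to keep the separate "class-against-all" classifiers from the question, one common way to make them work together is to ask each of them for a score and pick the class whose classifier is most confident. The following is only a minimal sketch of that idea on the iris data, not a definitive implementation; the names `binary_clfs` and `predict_one_vs_rest` are made up for illustration.

```python
import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()
X, y = iris.data, iris.target
classes = np.unique(y)

# Train one binary "class-against-all" SVM per class:
# the class itself is labelled 1, everything else -1.
binary_clfs = []
for c in classes:
    labels = np.where(y == c, 1, -1)
    clf = svm.SVC()
    clf.fit(X, labels)
    binary_clfs.append(clf)


def predict_one_vs_rest(samples):
    """Hypothetical helper: combine the binary classifiers by highest score."""
    # For a binary SVC, decision_function returns one signed score per sample;
    # a larger value means the sample lies further on the "1" side.
    scores = np.column_stack(
        [clf.decision_function(samples) for clf in binary_clfs]
    )
    # Pick, for every sample, the class whose classifier scored highest.
    return classes[np.argmax(scores, axis=1)]


predictions = predict_one_vs_rest(X)
accuracy = float(np.mean(predictions == y))
print(predictions[[25, 75, 125]], accuracy)
```

scikit-learn also ships `sklearn.multiclass.OneVsRestClassifier`, which wraps this same strategy around any binary estimator, so the loop above is mainly useful for seeing what happens underneath.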

Have a look at this question: Prepare data for text classification using Scikit Learn SVM. This is what you need to do.
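
For text data, the usual preparation is to turn each document into a numeric feature vector before it reaches the SVM. Below is a rough sketch of one way to do that, assuming TF-IDF features and a linear SVM; the example sentences are invented placeholders, and the three class names are the ones mentioned in the comments.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus: in practice each string would be the content
# of one text file, and each label the class that file belongs to.
train_texts = [
    "the new drug passed its clinical trial",
    "pharmacy chain recalls tablets after a dosage error",
    "the store announced a discount on winter clothing",
    "supermarket sales rose during the holiday shopping season",
    "the restaurant serves fresh seafood near the beach",
    "cheap flights and hotel deals for a weekend trip",
]
train_labels = [
    "pharma", "pharma",
    "retail", "retail",
    "food&travel", "food&travel",
]

# TfidfVectorizer converts raw text into sparse TF-IDF vectors;
# LinearSVC then handles the three classes in a single model.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)

new_texts = [
    "clinical trial results for the drug",
    "hotel and flights for the trip",
]
predicted = model.predict(new_texts)
print(predicted)
```

If the documents are stored as one folder per class, `sklearn.datasets.load_files` can read them into texts and labels in this shape. With only a handful of training sentences the predictions are not meaningful; the point is the shape of the pipeline.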

Irshad Bhat
  • But doesn't SVM operate on binary classification only? Are you sure it will work? –  Nov 29 '14 at 16:27
  • SVM doesn't operate on binary classification only. It can classify with any number of classes. Try it this way. It will definitely work. – Irshad Bhat Nov 29 '14 at 18:04
  • I've uploaded code where SVM accurately classifies data with three classes. Check it. – Irshad Bhat Nov 29 '14 at 18:12
  • 1
    @rzach you should also refer to http://scikit-learn.org/stable/datasets/ as suggested by bhat –  Nov 30 '14 at 04:36
  • @BHATIRSHAD thanks a lot. But how do I prepare my data for SVM? My data is actually text data, with the classes "pharma", "retail" and "food&travel". I have 100 text files for each of these classes. How do I convert these text files so that they can be used with SVM? –  Nov 30 '14 at 04:42
  • @rzach I've provided a link in my answer where you can find the solution of text classification using SVM. – Irshad Bhat Nov 30 '14 at 04:57
  • @BHATIRSHAD can you please also answer this (nobody has answered it yet)? It won't take much of your time: http://stackoverflow.com/questions/27201418/data-preparation-feature-selection-for-named-entity-using-svm –  Nov 30 '14 at 06:23