
I am using sklearn in Python to run cross-validation with SVMs. The linear and RBF kernels both work fine, but with the polynomial kernel the run never finishes: it has been going for 8 hours with no result. The input X has shape (1422, 2).

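from sklearn import svm
# In current scikit-learn, cross_val_score lives in sklearn.model_selection
# (older releases exposed it as sklearn.cross_validation).
from sklearn.model_selection import cross_val_score

def SupportVectorMachines(X, y):
    clf = svm.SVC(C=1.0, kernel='poly', degree=3, gamma=2)
    classifier = clf.fit(X, y)  # fit on the full data set before cross-validating
    score = cross_val_score(classifier, X, y, cv=10, n_jobs=1).mean()
    return score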

Any ideas why that is?

Thanks

user1663930
  • Did you standardize the inputs? SVMs can be very picky about that, and the poly kernel in particular has numerical stability problems. – Fred Foo Mar 24 '14 at 09:07
  • Yes, I did. It still doesn't work. I tried with both standardized and non-standardized inputs. – user1663930 Mar 24 '14 at 10:00
  • Hm. Well, SVM training can take cubic time in the worst case. Have you tried setting `verbose=2` on `cross_val_score` to see if it can at least train one SVM in 8 hours? – Fred Foo Mar 24 '14 at 10:24
  • Still stuck. It didn't print anything. – user1663930 Mar 25 '14 at 11:15
  • I'm afraid I'm out of ideas; I never use kernel SVMs because their training time is so hard to estimate (although on a 1422×2 dataset, 8 hours is pretty extreme). – Fred Foo Mar 25 '14 at 13:16
  • Were you able to solve that problem? – AturSams Apr 21 '19 at 11:39
  • Is it possible that you put data in with the transpose of what you want? Do you want 2 datapoints with 1422 dimensions, or 1422 datapoints with 2 dimensions? – D A May 11 '20 at 22:02
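The checks suggested in the comments above (confirming the orientation of X and timing a single fit before cross-validating) can be sketched roughly as follows. This is only an illustration: the data is a synthetic stand-in generated with `make_classification`, not the asker's data, and the `max_iter` cap is an assumption added so that the sketch is guaranteed to terminate.

```python
import time

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in for the asker's data (assumption): 1422 samples, 2 features.
X, y = make_classification(n_samples=1422, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# scikit-learn expects rows to be samples and columns to be features.
n_samples, n_features = X.shape
print(f"{n_samples} samples with {n_features} features each")
assert n_samples == len(y), "X looks transposed: one label is needed per row"

# Time one fit on its own; max_iter caps the solver (illustrative assumption).
start = time.perf_counter()
SVC(C=1.0, kernel='poly', degree=3, gamma=2, max_iter=100000).fit(X, y)
elapsed = time.perf_counter() - start
print(f"single fit took {elapsed:.2f} s")
```

If a single fit is already slow, the ten fits behind `cv=10` will be correspondingly slower.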

2 Answers

2

Try reducing C (for example C = 0.001, 0.01, or 0.1). C is the penalty parameter for training errors: the larger it is, the harder the optimizer works to avoid misclassifying training points, which tends to make training take longer.

Alternatively, reduce the number of cross-validation folds. Since the dataset has only 1422 points, try cv=5; fewer folds means fewer models to train and a shorter running time.
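A minimal sketch of both suggestions, assuming a recent scikit-learn where `cross_val_score` is imported from `sklearn.model_selection`. The data is a synthetic stand-in from `make_classification`, the helper name `score_for_C` is hypothetical, and the `max_iter` cap is an extra assumption so the sketch always terminates.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for the asker's data (assumption): 1422 samples, 2 features.
X, y = make_classification(n_samples=1422, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

def score_for_C(C, cv=5):
    """Mean cross-validated accuracy of a cubic-kernel SVC for one value of C."""
    clf = SVC(C=C, kernel='poly', degree=3, gamma=2, max_iter=100000)
    with warnings.catch_warnings():
        # Silence the warning emitted if the iteration cap is reached.
        warnings.simplefilter('ignore', ConvergenceWarning)
        return cross_val_score(clf, X, y, cv=cv, n_jobs=1).mean()

# Try the smaller C values with 5 folds instead of 10.
scores = {C: score_for_C(C) for C in (0.001, 0.01, 0.1)}
for C, score in scores.items():
    print(f"C={C}: mean accuracy {score:.3f}")
```

Comparing the scores shows whether a smaller C costs accuracy on your data; the numbers from synthetic data say nothing about the original problem.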

1

Try setting `max_iter=100000` (an integer; recent scikit-learn versions reject the float `1e5`).

Something like:

    clf = svm.SVC(C=1.0, kernel='poly', degree=3, gamma=2, max_iter=100000)

It emits the following warning (not an error), but it terminates:

    /usr/local/lib/python3.6/dist-packages/sklearn/svm/_base.py:231: ConvergenceWarning: Solver terminated early (max_iter=100000).  Consider pre-processing your data with StandardScaler or MinMaxScaler.
      % self.max_iter, ConvergenceWarning)
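The warning itself recommends scaling, so one way to combine it with the iteration cap is a pipeline. The following is a sketch under assumptions, not a definitive fix: the data is synthetic (`make_classification`), and whether the cap is reached on your data may differ.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the asker's data (assumption): 1422 samples, 2 features.
X, y = make_classification(n_samples=1422, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# The scaler is refit on the training part of every fold, so no
# information from the held-out fold leaks into the scaling.
model = make_pipeline(
    StandardScaler(),
    SVC(C=1.0, kernel='poly', degree=3, gamma=2, max_iter=100000),
)

# Record warnings so we can report whether the solver hit the cap.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    scores = cross_val_score(model, X, y, cv=10, n_jobs=1)

hit_cap = any(issubclass(w.category, ConvergenceWarning) for w in caught)
print(f"mean accuracy: {scores.mean():.3f}, iteration cap reached: {hit_cap}")
```

If `hit_cap` is true, the reported scores come from a solver that stopped early, so treat them with caution.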