I am using mlxtend's EnsembleVoteClassifier for binary classification with prefitted linear SVCs, but I keep getting the same error:

ValueError: X.shape[1] = 352 should be equal to 336, the number of features at training time

I load the prefitted classifiers into a list using scikit-learn's joblib. The classifiers are linear-kernel SVCs from sklearn.svm:

Contents of list_of_clfs:

[SVC(C=0.1, cache_size=200, class_weight=None, coef0=0.0,
     decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
     max_iter=-1, probability=False, random_state=None, shrinking=True,
     tol=0.001, verbose=False),
 SVC(C=0.1, cache_size=200, class_weight=None, coef0=0.0,
     decision_function_shape='ovr', degree=3, gamma='auto', kernel='linear',
     max_iter=-1, probability=False, random_state=None, shrinking=True,
     tol=0.001, verbose=False)]

They are passed to the ensemble vote classifier, which is then fitted as usual without any issue:

from mlxtend.classifier import EnsembleVoteClassifier
ensembleVoting = EnsembleVoteClassifier(clfs=list_of_clfs, refit=False, voting='hard', weights=None)
X = ...
y = ...
ensembleVoting.fit(X,y)

The error mentioned above occurs when predicting, even with the same data used for fitting:

predictions = ensembleVoting.predict(X)
compmonks
  • The only reason to get this would be if one or more of your prefitted classifiers was fitted with a different number of features. How is your list of CLFs fitted? – Ken Syme Mar 14 '18 at 12:50
  • Yes, you were right. As I am working with time series, I did not clip their frequencies to make sure there would be no difference in the number of features between fitting and prediction. Thank you! – compmonks Mar 14 '18 at 14:49

1 Answer

As mentioned by @ken-syme in the comments above, the classifiers were fitted with a different number of features than the data passed to the ensemble. This happened because the time series used as data were not all sampled at exactly the same frequency.
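
One way to catch this before calling predict is to compare the feature count each prefitted classifier was trained on against the number of columns in X. The snippet below is only a rough sketch: the helper names fitted_feature_count and find_feature_mismatches are made up for illustration, and it assumes the fitted estimators expose n_features_in_ (newer scikit-learn), shape_fit_ (SVC) or coef_ (linear models).

```python
def fitted_feature_count(clf):
    """Return the number of features a fitted estimator was trained on, or None."""
    # Newer scikit-learn releases set n_features_in_ on fitted estimators.
    n = getattr(clf, "n_features_in_", None)
    if n is not None:
        return int(n)
    # SVC stores the shape of its training matrix in shape_fit_.
    shape = getattr(clf, "shape_fit_", None)
    if shape is not None:
        return int(shape[1])
    # Linear models expose coef_ whose last dimension is the feature count.
    coef = getattr(clf, "coef_", None)
    if coef is not None:
        return int(coef.shape[-1])
    return None


def find_feature_mismatches(clfs, n_features):
    """Return (index, fitted_count) for each classifier that disagrees with n_features."""
    mismatches = []
    for i, clf in enumerate(clfs):
        fitted = fitted_feature_count(clf)
        if fitted is not None and fitted != n_features:
            mismatches.append((i, fitted))
    return mismatches
```

With the names from the question, calling find_feature_mismatches(list_of_clfs, X.shape[1]) would return the positions of the classifiers trained on a different number of features. An empty list means every classifier whose count could be read agrees with X.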

compmonks