I am training a multi-label logistic regression classifier with the OneVsRestClassifier wrapper. For some observations it returns no labels at all, and predict_proba returns probabilities very close to zero for every class, even though I know these examples belong to at least one class.

Is there any way to calibrate a multi-label classifier like this so that it always returns at least one label?

UPDATE #1: The code I'm currently using to fit the classifier and retrieve the probabilities:

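from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Binarize the label sets into a 0/1 indicator matrix
mlb = MultiLabelBinarizer()
train_labels = mlb.fit_transform(train_labels)

# Fit one independent logistic regression per label
clf = OneVsRestClassifier(LogisticRegression(C=1., solver='lbfgs'))
clf.fit(train_profiles, train_labels)

# Predict per-label probabilities for a single test profile
probas = clf.predict_proba([x_test])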

To give a bit of background, the classifier is trained and tested on numerical vector profiles for a corpus of texts. These profiles are obtained by applying a dimensionality reduction algorithm (SVD). I was wondering whether some additional normalization might be necessary, but I also expected the multi-label classifier to always return at least one label without any further pre-processing of the profiles.
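To make the symptom concrete, here is a minimal sketch of how the empty predictions could be counted, along with one possible fallback that assigns the highest-probability label whenever no label clears the cutoff. Everything in it is an assumption for illustration: the data is synthetic (make_multilabel_classification stands in for my SVD profiles), the 0.5 cutoff is a guess at the decision threshold, and force_at_least_one_label is a hypothetical helper, not something scikit-learn provides. It is a workaround rather than a calibration method.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in for the SVD profiles and binarized labels (assumption).
# allow_unlabeled=False guarantees every sample truly has at least one label.
X, Y = make_multilabel_classification(
    n_samples=300, n_features=20, n_classes=5,
    allow_unlabeled=False, random_state=0,
)

clf = OneVsRestClassifier(
    LogisticRegression(C=1.0, solver="lbfgs", max_iter=1000)
)
clf.fit(X, Y)

# One column of probabilities per label; the columns are fitted independently,
# so nothing forces any of them above the cutoff for a given row.
probas = clf.predict_proba(X)


def force_at_least_one_label(probas, threshold=0.5):
    """Hypothetical helper: threshold the probabilities, then give rows that
    ended up with no label their single most probable label."""
    pred = (probas >= threshold).astype(int)
    empty_rows = np.flatnonzero(pred.sum(axis=1) == 0)
    pred[empty_rows, probas[empty_rows].argmax(axis=1)] = 1
    return pred


thresholded = (probas >= 0.5).astype(int)
n_empty = int((thresholded.sum(axis=1) == 0).sum())
print("rows with no predicted label:", n_empty)

fixed = force_at_least_one_label(probas)
print("rows with no label after fallback:", int((fixed.sum(axis=1) == 0).sum()))
```

The fallback only changes rows that would otherwise be empty, so predictions for rows that already had a label stay as they were. Whether the forced label is trustworthy when all probabilities are near zero is a separate question.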

ce57