I noticed that my F-scores are slightly lower when I wrap scikit-learn's LogisticRegression in the following one-vs-rest classifier than when I use LogisticRegression by itself for multi-class classification.
from sklearn.multiclass import OneVsRestClassifier

class MyOVRClassifier(OneVsRestClassifier):
    """
    This OVR classifier will always choose at least one label,
    regardless of the probability.
    """
    def predict(self, X):
        predictions = []
        for probs in self.predict_proba(X):
            # Keep every class tied for the maximum probability,
            # so each sample always gets at least one label.
            p_max = max(probs)
            predictions.append(tuple(
                self.classes_[i] for i, p in enumerate(probs) if p == p_max
            ))
        return predictions
Since the documentation for LogisticRegression states that it uses a one-vs-all strategy for multi-class problems, I'm wondering what factors could account for the difference in performance. My one-vs-rest LR classifier seems to over-predict one of the classes more than the LR classifier does on its own.
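To check whether the wrapper itself shifts the scores (independent of my data), here is a minimal, self-contained sketch on a synthetic dataset (the dataset, split, and `max_iter` value are my own assumptions, not from my real setup) comparing plain LogisticRegression against an explicit OneVsRestClassifier wrapper:

```python
# Hypothetical comparison on synthetic data: plain LogisticRegression
# vs. an explicit OneVsRestClassifier wrapper around the same model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 3-class problem (stand-in for my real data).
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, n_classes=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0)

plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
ovr = OneVsRestClassifier(
    LogisticRegression(max_iter=1000)).fit(X_train, y_train)

# Macro F1 for each setup; on my data these come out slightly different.
print(f1_score(y_test, plain.predict(X_test), average='macro'))
print(f1_score(y_test, ovr.predict(X_test), average='macro'))
```

If the two macro F1 values differ even here, that would point at the decision rule (how each setup turns per-class scores into a single label) rather than at my custom `predict` override.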