
I noticed that my F-scores are slightly lower when I use scikit-learn's LogisticRegression classifier inside the following one-vs-rest wrapper than when I use LogisticRegression by itself for multi-class classification.

from sklearn.multiclass import OneVsRestClassifier

class MyOVRClassifier(OneVsRestClassifier):
    """
    This OVR classifier will always choose at least one label,
    regardless of the probability.
    """
    def predict(self, X):
        # Per-class probabilities for the (single) sample in X.
        probs = self.predict_proba(X)[0]
        p_max = max(probs)
        # Return every label tied for the highest probability.
        return [tuple(self.classes_[i] for i, p in enumerate(probs) if p == p_max)]
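
A minimal usage sketch (the toy data below is made up purely for illustration, not my actual dataset):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data: 30 samples, 4 features, 3 classes.
rng = np.random.RandomState(0)
X_train = rng.rand(30, 4)
y_train = rng.randint(0, 3, size=30)

clf = MyOVRClassifier(LogisticRegression())
clf.fit(X_train, y_train)
print(clf.predict(X_train[:1]))  # e.g. [(2,)] -- label(s) tied at the max probability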

Since the LogisticRegression documentation states that it uses a one-vs-all strategy for multi-class problems, I'm wondering what factors could account for the difference in performance. My one-vs-rest LR classifier seems to over-predict one of the classes more than the LR classifier does on its own.
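
To make the comparison concrete, a sketch along these lines scores the two setups side by side (the dataset, split, and macro averaging are placeholder assumptions, and the stock OneVsRestClassifier stands in for the custom wrapper here):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Plain LogisticRegression (one-vs-rest by default in older scikit-learn;
# newer versions default to a multinomial formulation).
lr = LogisticRegression().fit(X_tr, y_tr)
print(f1_score(y_te, lr.predict(X_te), average='macro'))

# Explicit one-vs-rest wrapper around the same base classifier.
ovr = OneVsRestClassifier(LogisticRegression()).fit(X_tr, y_tr)
print(f1_score(y_te, ovr.predict(X_te), average='macro'))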

Nathan Breit

1 Answer


Just a guess, but probably when "no one votes" (no binary classifier assigns a meaningful probability to its positive class) you get many tiny floating-point values, and with LR they can underflow to zero. So instead of picking the class whose classifier is most confident, you end up tie-breaking among zeros, which systematically favors the same class. See an example here of the difference.
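
A quick toy illustration of the tie-breaking effect (not real classifier output, just an array): once every per-class score underflows to 0.0, argmax-style selection always returns the first index, which would over-predict one class exactly as described.

import numpy as np

# All per-class probabilities underflowed to zero: a many-way tie,
# and argmax resolves it by always picking index 0.
probs = np.zeros(3)
print(np.argmax(probs))  # 0 -- the first class wins every tie

# With non-degenerate (even extremely small) values, the most
# confident class wins as expected.
print(np.argmax(np.array([1e-300, 3e-300, 2e-300])))  # 1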

Raff.Edward