I have an sklearn pipeline consisting of a `TfidfVectorizer` and an `SGDClassifier(loss='log')`, fitted on a multi-label training dataset. When I call ELI5's `explain_prediction` on a sample (passing the vectorizer and classifier taken from the pipeline), it reports different probabilities than `pipeline.predict_proba` returns for the same sample. Why is this?
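For reference, here is a minimal sketch of the setup described above, not the original code. The toy texts, the three class labels, and `random_state=0` are assumptions made only for illustration. It checks whether `pipeline.predict_proba` matches the probabilities obtained by applying the extracted vectorizer and classifier by hand, which is the pair of objects that would be handed to ELI5.

```python
import numpy as np
import sklearn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# loss="log" from the question was renamed to "log_loss" in scikit-learn 1.1,
# so pick whichever spelling the installed version accepts.
major, minor = (int(part) for part in sklearn.__version__.split(".")[:2])
log_loss_name = "log_loss" if (major, minor) >= (1, 1) else "log"

# Hypothetical toy data standing in for the real training set.
texts = [
    "the cat sat on the mat",
    "dogs chase the ball",
    "stocks fell sharply today",
    "the market rallied on earnings",
    "the team won the final match",
    "the striker scored two goals",
]
labels = ["pets", "pets", "finance", "finance", "sports", "sports"]

# Fixed random_state so repeated fits give the same model.
pipeline = make_pipeline(
    TfidfVectorizer(),
    SGDClassifier(loss=log_loss_name, random_state=0),
)
pipeline.fit(texts, labels)

# make_pipeline names each step after its lowercased class name.
vectorizer = pipeline.named_steps["tfidfvectorizer"]
classifier = pipeline.named_steps["sgdclassifier"]

sample = ["the cat chased the ball"]

# Probabilities from the whole pipeline vs. from the two steps applied by hand.
proba_pipeline = pipeline.predict_proba(sample)
proba_manual = classifier.predict_proba(vectorizer.transform(sample))

steps_agree = np.allclose(proba_pipeline, proba_manual)
print(proba_pipeline)
print(proba_manual)
print("pipeline and manual steps agree:", steps_agree)
```

If the two arrays agree here, the fitted vectorizer and classifier are consistent with the pipeline, and the remaining difference would have to come from how the ELI5 call is made or from how ELI5 computes the probability it displays.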

Kim Tang
Dennis
  • Can you please provide a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) of your code? – Kim Tang Oct 15 '20 at 14:02
  • Have you set `random_state` to the same values? Can you share the relevant part of your eli5 and sklearn model initialisation? – Prayson W. Daniel Oct 16 '20 at 05:56

0 Answers