Multinomial Naive Bayes classifier returns different results on the same data

Asked May 25 '22 at 09:58

Active May 25 '22 at 10:20

I've been using the MultinomialNB classifier from sklearn.naive_bayes on vectorized text data.

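from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# self.class_prior comes from the enclosing class (not shown)
make_pipeline(
    TfidfVectorizer(),
    MultinomialNB(
        alpha=0.1,
        fit_prior=True,
        class_prior=self.class_prior,
    ),
)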

After some data preprocessing (lowercasing, stemming, stopword removal, ...) and training, I save the trained model to a model.sav file for later prediction. I trained two models on the same training data and saved them to two different files. How is it possible that the predicted classes and probabilities for the same input are different?

model 1 prediction

{
    "class": 2,
    "probability": 0.554312
}

model 2 prediction

{
    "class": 1,
    "probability": 0.530134
}

As far as I know, MultinomialNB does not use a random state or seed that could cause this, or does it?
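A minimal sketch for narrowing this down, assuming a small made-up corpus in place of the real training data and class_prior=None in place of self.class_prior (whose value is not shown above). It fits the same pipeline twice on identical input, compares the predicted probabilities and the fitted state, and checks whether any step exposes a random_state parameter.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus standing in for the real preprocessed training data.
texts = [
    "cheap pills buy now",
    "meeting agenda for monday",
    "buy cheap watches now",
    "monday project status meeting",
    "win money now",
    "lunch on monday",
]
labels = [1, 2, 1, 2, 1, 2]


def build_model():
    # Same settings as in the question; class_prior is None here because
    # the original self.class_prior value is not shown (assumption).
    return make_pipeline(
        TfidfVectorizer(),
        MultinomialNB(alpha=0.1, fit_prior=True, class_prior=None),
    )


# Two independent fits on exactly the same input.
model_1 = build_model().fit(texts, labels)
model_2 = build_model().fit(texts, labels)

sample = ["buy now on monday"]
same_proba = np.allclose(
    model_1.predict_proba(sample), model_2.predict_proba(sample)
)

# Compare the fitted state directly: vocabulary and learned log-probabilities.
vec_1 = model_1.named_steps["tfidfvectorizer"]
vec_2 = model_2.named_steps["tfidfvectorizer"]
nb_1 = model_1.named_steps["multinomialnb"]
nb_2 = model_2.named_steps["multinomialnb"]

same_vocab = vec_1.vocabulary_ == vec_2.vocabulary_
same_params = bool(
    np.allclose(nb_1.feature_log_prob_, nb_2.feature_log_prob_)
    and np.allclose(nb_1.class_log_prior_, nb_2.class_log_prior_)
)

# Look for a random_state parameter anywhere in the pipeline.
has_random_state = any(
    name.endswith("random_state") for name in model_1.get_params()
)

print("same probabilities:", same_proba)
print("same vocabulary:", same_vocab)
print("same NB parameters:", same_params)
print("random_state parameter present:", has_random_state)
```

If the two fits agree on a check like this but the two saved models still disagree, the difference presumably enters before or around the classifier, for example through the preprocessing output, the order or content of the training documents, or the class_prior value passed in. Running the same comparison of vocabulary_, class_log_prior_ and feature_log_prob_ on the two loaded models should show where they diverge.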

hoodie_hxe

does it stay consistent if you fix a random seed? – LukasNeugebauer May 25 '22 at 10:32
Thanks for your reply. Which random seed do you mean? I didn't find any seed parameter in the documentation here https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html – hoodie_hxe May 25 '22 at 11:23
I tried adding random.seed() before the whole classification, and it did not help. – hoodie_hxe May 25 '22 at 14:22

0 Answers