How can I assign "custom" prior probabilities to the naive Bayes classifier in scikit-learn?

For simplicity, take the Iris dataset as an example: 150 samples in 3 different classes, with 50 samples per class. I assume that by default the classifier estimates the priors from the class frequencies in the training data, so p(c_i) ≈ 0.33 (depending on how the training dataset was resampled).
But what if I have some additional knowledge and know that flower class 1 occurs much more often in "reality", so that the priors for the different classes should be

p(c=1) = 0.8
p(c=2) = 0.1
p(c=3) = 0.1
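
To check my assumption about the default priors, here is a minimal sketch. It assumes that a fitted GaussianNB exposes the priors it used as the class_prior_ attribute, and it fits on the full Iris data without a train/test split; the name default_clf is only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# Full Iris data: 150 samples, 3 classes, 50 samples per class
X, y = load_iris(return_X_y=True)

# No priors given, so they are estimated from the class frequencies in y
default_clf = GaussianNB().fit(X, y)

# With balanced classes, each entry should be close to 1/3
print(default_clf.class_prior_)
```

With a resampled or unbalanced training set, these values would follow the class proportions of that subset instead.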

Let's assume I have done all the preprocessing (feature selection, normalization/standardization, dimensionality reduction, etc.) and want to use the Gaussian naive Bayes classifier like this:

from sklearn.naive_bayes import GaussianNB

gnb_clf = GaussianNB()
gnb_clf.fit(X_train, y_train)
pred_test = gnb_clf.predict(X_test)

How would I assign my "custom prior probabilities"?

I see that GaussianNB has a set_params method (see the documentation); however, I am not sure how to use it for this.
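
For concreteness, this is roughly what I have in mind, as a sketch rather than a confirmed solution. It assumes a scikit-learn version (0.18 or later, as far as I can tell) in which GaussianNB accepts a priors argument, given in the order of the sorted class labels. The Iris split, the random_state, and the names custom_priors and gnb_clf2 are only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Illustrative data and split (stand-ins for my preprocessed X_train, X_test)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# One prior per class, in the order of the sorted class labels; must sum to 1
custom_priors = [0.8, 0.1, 0.1]

# Option 1: pass the priors when constructing the classifier
gnb_clf = GaussianNB(priors=custom_priors)
gnb_clf.fit(X_train, y_train)

# Option 2: set the same constructor parameter via set_params before fitting
gnb_clf2 = GaussianNB()
gnb_clf2.set_params(priors=custom_priors)
gnb_clf2.fit(X_train, y_train)

pred_test = gnb_clf.predict(X_test)

# The priors actually used by the fitted model
print(gnb_clf.class_prior_)
```

If this is right, set_params only changes constructor parameters, so it would have to be called before fit for the custom priors to take effect.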
