
Suppose I have an input feature set X = {X1, X2}, where X1 is real-valued (assume it follows a Gaussian distribution) and X2 is a categorical feature. If I want to use the Naive Bayes algorithm, which variant should I use? Put another way: does GaussianNB work well on categorical features?

Biplab Roy
    I know it's not popular, but why not implement a mixed NB yourself? The likelihood functions are readily available. – dedObed Jan 09 '21 at 15:42

2 Answers


Each NB variant expects a different type of data:

  • GaussianNB → when you have continuous features.
  • CategoricalNB → when you have categorical features.
  • MultinomialNB → when you have count data (commonly text word counts).

So, if your data has continuous features, categorical features, and text data, which algorithm should you use? The fundamental assumption of every NB variant is that the features are conditionally independent given the class. So you can fit the categorical features with CategoricalNB, the continuous features with GaussianNB, and the text data with MultinomialNB, take the likelihood probabilities from each model (for each data point you now have 3 likelihoods), and multiply them to get the overall likelihood.

Note: You have to multiply the class prior probability into the final likelihood to get the final posterior probability.
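The recipe above can be sketched with sklearn. The data here is a hypothetical toy example (X1 continuous, X2 integer-coded categorical); the key detail is that `predict_log_proba` of each sub-model already includes the class prior, so after summing the two log-posteriors you subtract one extra copy of the log prior:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, CategoricalNB

# Hypothetical toy data: X1 continuous, X2 categorical (integer-coded)
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
X1 = rng.normal(loc=y, scale=1.0, size=200).reshape(-1, 1)  # continuous feature
X2 = rng.integers(0, 3, size=200).reshape(-1, 1)            # categorical feature

gnb = GaussianNB().fit(X1, y)
cnb = CategoricalNB().fit(X2, y)

# Each model's log-posterior = log prior + log likelihood - const,
# so sum them and subtract one copy of the log prior to avoid counting it twice.
log_prior = np.log(gnb.class_prior_)
joint = gnb.predict_log_proba(X1) + cnb.predict_log_proba(X2) - log_prior
y_pred = joint.argmax(axis=1)
```

Working in log space avoids underflow from multiplying many small probabilities; the unnormalized `joint` is enough for `argmax`.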

Get more depth from here

Abu Ubaida

Transform your categorical feature X2 using get_dummies() (pandas library), and then train the model.
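A minimal sketch of that suggestion, on a made-up frame (column names `X1`, `X2`, `y` are illustrative):

```python
import pandas as pd
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy frame: X1 continuous, X2 categorical
df = pd.DataFrame({
    "X1": [1.2, 0.7, 3.1, 2.2, 0.3, 2.9],
    "X2": ["a", "b", "a", "c", "b", "c"],
    "y":  [0, 0, 1, 1, 0, 1],
})

# One-hot encode only the categorical column, keep X1 as-is
X = pd.get_dummies(df[["X1", "X2"]], columns=["X2"])
model = GaussianNB().fit(X, df["y"])
pred = model.predict(X)
```

Note the caveat: after get_dummies() the dummy columns are 0/1 indicators, so GaussianNB's normality assumption is technically violated for them; it often still works in practice, but that is why trying other variants (below) is worthwhile.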

I recommend trying GaussianNB first and evaluating its accuracy. Then try the other Naive Bayes models that sklearn provides. Without seeing the data (and even with it) it is quite difficult to predict which model will work best in each case, so evaluate each one.

Alex Serra Marrugat