I have a question regarding the Naïve Bayes classification method. I ran through what I thought was an easy example but hit a snag.
Basically here is the classification I would like to do:
I want to be able to take some training data:
input1 | input2 | input3 | class
1      | 3      | 3      | 1
2      | 1      | 1      | 2
1      | 1      | 1      | 3
3      | 3      | 3      | 1
and classify each row into one of the classes 1-3.
As I understand it, you first compute the prior probability of each class, so in this case that would be:
class 1 = P(c_1) = 0.50
class 2 = P(c_2) = 0.25
class 3 = P(c_3) = 0.25
which thus far makes perfect sense. They all add to 1 and it's very easy to see where those numbers come from.
Because the input values are numeric, I wanted to simplify them into ranges, so I restructured my data into this:
So anyway, that's how I got to that table. Continuing with the Bayes part:
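To make the prior step concrete, here is a minimal Python sketch that counts how often each class appears in the four training rows above. The names `rows` and `class_priors` are just ones I made up for illustration; they are not from any library.

```python
from collections import Counter

# Training rows copied from the table above: (input1, input2, input3, class)
rows = [
    (1, 3, 3, 1),
    (2, 1, 1, 2),
    (1, 1, 1, 3),
    (3, 3, 3, 1),
]

def class_priors(rows):
    """Return {class: fraction of training rows labelled with that class}."""
    counts = Counter(row[-1] for row in rows)  # last column is the class label
    total = len(rows)
    return {label: n / total for label, n in counts.items()}

priors = class_priors(rows)
print(priors)  # {1: 0.5, 2: 0.25, 3: 0.25}
```

This reproduces the three priors listed above, and they sum to 1 as expected.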
P(Class 1 | avg_speed_1): 0.5
P(Class 1 | avg_speed_2): 0
P(Class 1 | avg_speed_3): 0
P(Class 2 | avg_speed_1): 0
P(Class 2 | avg_speed_2): 0.25
P(Class 2 | avg_speed_3): 0
P(Class 3 | avg_speed_1): 0
P(Class 3 | avg_speed_2): 0
P(Class 3 | avg_speed_3): 0.25
P(Class 1 | avg_distance_1): 0.5
P(Class 1 | avg_distance_2): 0
P(Class 1 | avg_distance_3): 0
P(Class 2 | avg_distance_1): 0
P(Class 2 | avg_distance_2): 0.25
P(Class 2 | avg_distance_3): 0
P(Class 3 | avg_distance_1): 0
P(Class 3 | avg_distance_2): 0
P(Class 3 | avg_distance_3): 0.25
P(Class 1 | avg_elev_gain_1): 0.5
P(Class 1 | avg_elev_gain_2): 0
P(Class 1 | avg_elev_gain_3): 0
P(Class 2 | avg_elev_gain_1): 0
P(Class 2 | avg_elev_gain_2): 0
P(Class 2 | avg_elev_gain_3): 0
P(Class 3 | avg_elev_gain_1): 0
P(Class 3 | avg_elev_gain_2): 0
P(Class 3 | avg_elev_gain_3): 0.5
Now this all still makes sense to me. Each class still adds to 1; however, when I go to compute the probability for each class, the 0s screw up the calculation.
Take the first class, for example:
P(Class 1 | avg_speed_1) *
P(Class 1 | avg_speed_2) *
P(Class 1 | avg_speed_3) *
P(Class 1 | avg_distance_1) *
P(Class 1 | avg_distance_2) *
P(Class 1 | avg_distance_3) *
P(Class 1 | avg_elev_gain_1) *
P(Class 1 | avg_elev_gain_2) *
P(Class 1 | avg_elev_gain_3) *
P(Class 1) = 0
I've found that this always equals zero because a number of the input terms are still zero! Where did I go wrong? Does this mean that I have insufficient training data?
That being said, is the Naïve Bayes approach even the right way to approach this classification?
Any thoughts would be greatly appreciated.
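For concreteness, this is roughly the Class 1 product written out as a small Python sketch. The numbers are copied from my lists above; the variable names `class1_terms`, `prior_class1` and `score_class1` are only illustrative, and `math.prod` assumes Python 3.8 or newer.

```python
import math

# Class 1 terms in the order listed above:
# avg_speed_1..3, avg_distance_1..3, avg_elev_gain_1..3
class1_terms = [
    0.5, 0.0, 0.0,  # avg_speed
    0.5, 0.0, 0.0,  # avg_distance
    0.5, 0.0, 0.0,  # avg_elev_gain
]
prior_class1 = 0.5  # P(Class 1)

# Multiply every term together with the prior, as in the product above.
score_class1 = prior_class1 * math.prod(class1_terms)

# Count how many terms are exactly zero; any single one forces the product to 0.
zero_terms = sum(1 for t in class1_terms if t == 0.0)

print(score_class1)  # 0.0
print(zero_terms)    # 6
```

Six of the nine terms are zero, so the whole product collapses to zero no matter what the other values are.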