I ran an H2O gradient boosting classifier (GBM) to predict probabilities for three classes: 0, 1, and 2. There is a heavy class imbalance (93:5:2) in the training data.
Although classes 1 and 2 are rarely predicted correctly in the confusion matrix (as expected given the imbalance), the per-class AUC for each of them is decent.
I therefore plan to assign the final classes manually from the predicted probabilities.
My understanding is that the resulting probabilities (P0, P1, and P2) are calibrated and sum to 1.
Since the multinomial model in H2O is essentially a one-vs-many approach, yet the scores sum to 1, is it valid to add or compare these probabilities?
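For concreteness, this is roughly how I am pulling the probabilities (Python client; the `p0`/`p1`/`p2` column names are my assumption for class labels 0/1/2, and `model`/`test` stand in for my trained GBM and test frame):

```python
# Sketch: pull the multinomial predictions into pandas and sanity-check
# that the per-row probabilities sum to 1 (up to floating-point error).
# `model` is the trained H2O GBM, `test` is an H2OFrame of test rows;
# the p0/p1/p2 column names are assumed for class labels 0/1/2.
preds = model.predict(test).as_data_frame()  # columns: predict, p0, p1, p2

row_sums = preds[["p0", "p1", "p2"]].sum(axis=1)
print(row_sums.min(), row_sums.max())  # expect both to be ~1.0
```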
So if P0 = 0.40, P1 = 0.35, and P2 = 0.25, the predicted class will be 0 (based on the maximum probability).
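In other words, picking the class manually by max probability should match what H2O already puts in the `predict` column; a quick check, using the same `preds` frame as above:

```python
# Manual argmax over the probability columns; idxmax returns the column
# name ("p0"/"p1"/"p2"), so strip the leading "p" to recover the label.
manual_class = preds[["p0", "p1", "p2"]].idxmax(axis=1).str[1:].astype(int)

# For a multinomial model, H2O's own `predict` column is also the
# max-probability class, so these should agree on every row.
print((manual_class == preds["predict"].astype(int)).mean())  # expect 1.0
```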
Does this mean P(1 or 2) = P1 + P2 = 0.35 + 0.25 = 0.60, i.e., P(not 0) = 0.60? (Since the model for class 0 is effectively class 0 against all other classes.)
Can I then compare the probabilities of classes 1 and 2 and say that, since P1 (0.35) > P2 (0.25), the predicted class should be 1? (Given that the classes are mutually exclusive and the probabilities sum to 1, are they directly comparable in this way?)
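In code, the two-stage rule I have in mind would look like this (same `preds` frame as above; the 0.5 cutoff on P(not 0) is just a placeholder I made up, not anything that comes from H2O):

```python
import numpy as np

# Stage 1: treat p1 + p2 as a one-vs-rest score for "not class 0".
p_not_0 = preds["p1"] + preds["p2"]

# Stage 2: among the minority classes, pick whichever of p1/p2 is larger.
minority_class = np.where(preds["p1"] > preds["p2"], 1, 2)

# Final assignment: class 0 unless P(not 0) clears the (placeholder) cutoff.
final_class = np.where(p_not_0 > 0.5, minority_class, 0)
```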