
After training my CatBoostClassifier model, I call predict_proba, which returns a list of probabilities. The problem starts later: I transfer that data into a DataFrame and then to Excel, and when I sum all the float numbers in my list, I get a value approximately equal to 2.

(Example: 0,980831511 0,99695788 2,99173E-13 1,63919E-15 7,35072E-14 4,82846E-16. Their sum is 1,977789391.)

Parameters which were used:

'loss_function': 'MultiClassOneVsAll', 
 'eval_metric': 'ZeroOneLoss',

The problem is that I need dependent probabilities, something more like 0.2 0.5 0.1 0.2, where the sum equals 1 and the highest probability (0.5 here) marks the predicted category, in this case the second one.

1 Answer


I've completed several tests.

I tried different objectives (loss functions) and metrics. If you need "dependent" probabilities, you can use any loss_function (correct me if I'm wrong) except multiclassova (in other words, OneVsAll). Using multiclassova only as the eval metric was fine: everything seemed right.
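To illustrate why the sums differ, here is a minimal pure-Python sketch. It assumes a one-vs-all objective turns each class's raw score into a probability independently (a per-class sigmoid), while an ordinary multiclass objective normalizes all scores together (softmax). The raw scores are hypothetical values picked so the one-vs-all sum lands near 2, as in the question; they do not come from a real model.

```python
import math

def sigmoid(x):
    # Independent probability for a single class (one-vs-all style).
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    # Joint normalization over all classes (multiclass style).
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for four classes (not from a real model).
raw_scores = [4.0, 6.0, -29.0, -34.0]

independent = [sigmoid(s) for s in raw_scores]
dependent = softmax(raw_scores)

print("one-vs-all sum:", sum(independent))  # two confident classes push this toward 2
print("softmax sum:   ", sum(dependent))    # equals 1 up to floating-point rounding
```

Nothing ties the independent sigmoids together, so two classes can both be close to 1 at once. Softmax forces the classes to share a total of 1, which is the "dependent" behaviour asked for.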

With OneVsAll (multiclassova): [image: using multiclassova]

With any other loss_function, as you can see, the probabilities of all events sum to 1, whereas in the OneVsAll case above the sum could vary from 0.5 to 2.0: [image: using other loss_function]

Julia Meshcheryakova