Weighted random baseline for class distribution Python

Asked May 02 '17 at 19:29

For a text classification experiment, I'm trying to calculate a weighted random baseline for a class distribution. I have three labels. This is some code I found for two labels: 'm' and 'f'.

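def wrb(distribution):  # weighted random baseline
    # A single float is treated as the proportion of one class in a
    # two-class problem; the other proportion is its complement.
    total = 0  # renamed from `sum` to avoid shadowing the built-in
    if isinstance(distribution, float):
        elem2 = 1 - distribution
        distribution = [distribution, elem2]
    # The baseline is the sum of the squared class proportions.
    for prop in distribution:
        total += prop**2
    return total

# `labels` is the list of class labels from the experiment.
distr = labels.count('m')/len(labels)
print('WRB', wrb(distr))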

My question is: which of my three labels should I use in place of 'm' in distr = labels.count('m')/len(labels)? Is there a rule, or do I literally choose one of my three labels at random?
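For reference, the function above also accepts a list of proportions, so one possible reading is that no single label has to be picked: the proportions of all three classes can be passed at once. Below is a minimal sketch under that assumption. The label names 'a', 'b', 'c' and the sample list are hypothetical, used only for illustration, and the use of collections.Counter is my own choice rather than part of the code I found.

```python
from collections import Counter

def wrb(distribution):
    """Weighted random baseline: sum of squared class proportions."""
    # A single float is treated as one class of a two-class problem.
    if isinstance(distribution, float):
        distribution = [distribution, 1 - distribution]
    return sum(prop ** 2 for prop in distribution)

# Hypothetical three-class labels, for illustration only.
labels = ['a', 'a', 'b', 'c']

# Proportion of each class: 0.5, 0.25, 0.25 for the sample above.
counts = Counter(labels)
proportions = [n / len(labels) for n in counts.values()]

baseline = wrb(proportions)
print('WRB', baseline)  # prints: WRB 0.375
```

In the two-label case the choice of label does not affect the result, because p**2 + (1 - p)**2 is the same for p and 1 - p. With three labels a single float is not enough to describe the distribution, which is why the sketch passes the full list.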

asked May 02 '17 at 19:29

Bambi

0 Answers