For a text classification experiment, I'm trying to calculate a weighted random baseline for a class distribution. I have three labels. This is some code I found for two labels: 'm' and 'f'.

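def wrb(distribution):  # weighted random baseline
    total = 0  # renamed from `sum` to avoid shadowing the built-in
    # A single float is treated as the proportion of one of two classes
    if isinstance(distribution, float):
        elem2 = 1 - distribution
        distribution = [distribution, elem2]
    # Add up the squared proportion of each class
    for prop in distribution:
        total += prop**2
    return total

distr = labels.count('m')/len(labels)
print('WRB', wrb(distr))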

My question is: which of my labels should I put in place of 'm' in distr = labels.count('m')/len(labels)? Is there a rule, or do I literally choose one of my three labels at random?
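
For reference, the function above already accepts a list of proportions, so one possible reading is that with three labels no single label has to be picked: the proportion of every label is squared and summed. Below is a minimal sketch of that reading, not a confirmed answer. The name wrb_multiclass and the example labels are made up for illustration, and it assumes labels is a non-empty Python list.

```python
from collections import Counter

def wrb_multiclass(labels):
    """Sum of squared class proportions over all labels present."""
    counts = Counter(labels)   # occurrences of each distinct label
    n = len(labels)            # assumes labels is non-empty
    return sum((c / n) ** 2 for c in counts.values())

# Hypothetical three-label data, for illustration only
example_labels = ['a', 'a', 'b', 'c']
print('WRB', wrb_multiclass(example_labels))
```

With only two labels this gives the same value as the original function, whichever of the two labels is passed in, because p**2 + (1 - p)**2 is symmetric in the two proportions.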

Bambi
