
I have a set of one-hot encoded labels and I want to count how many I have of each category. Each label can contain one or more classes, like this:

[1 0 0 0 0 0 0 1 0]

My first solution to the problem was to use np.argmax and np.bincount like this:

import numpy as np

# Take the index of the first 1 in each label
newLabels = []
for i in range(len(labels)):
    newLabels.append(np.argmax(labels[i]))

newLabels = np.asarray(newLabels)

np.bincount(newLabels)
array([1221,  722,  199,  918,  599,  678, 1569,  786,  185])

But then the example label above is assigned the value 0, and its second class (index 7) is not counted.
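The reason is that np.argmax returns only the index of the first maximum. A minimal sketch using the example label above (the variable names are just for illustration):

```python
import numpy as np

# Example label from the question: classes 0 and 7 are both set
label = np.array([1, 0, 0, 0, 0, 0, 0, 1, 0])

# argmax reports only the first position holding the maximum value
first_index = int(np.argmax(label))

# flatnonzero reports every position that is set
all_indices = np.flatnonzero(label).tolist()

print(first_index)   # 0
print(all_indices)   # [0, 7]
```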

Does anybody have a solution to this problem?

bjornsing

2 Answers

from collections import Counter

# Accumulate, per class index, how many labels have that class set
newLabels = Counter()
for label in labels:
    for idx, key in enumerate(label):
        newLabels[idx] += key

The result is a Counter (a dictionary subclass) whose keys are the class indices and whose values are the counts.

Tom Ron

A solution to this problem is to sum along the first axis, which counts every class that is set in each label:

np.sum(labels, axis=0)
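As a quick sanity check, here is a minimal sketch on made-up data, assuming labels is a 2-D array with one row per label and one column per class (the toy values below are hypothetical, not the data from the question):

```python
import numpy as np

# Hypothetical toy data: 3 labels over 9 classes
labels = np.array([
    [1, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
])

# Summing down the rows (axis=0) counts how often each class is set
counts = np.sum(labels, axis=0)
print(counts.tolist())  # [2, 1, 0, 0, 0, 0, 0, 2, 0]
```

Unlike the argmax approach, the counts add up to the total number of set classes (5 here), not to the number of labels.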
bjornsing