I am working with several large matrices of class IDs, each with a matching matrix of predicted probabilities. I want to average the probabilities per class across the matrices and then return, for each row, the 3 classes with the highest averaged probability.

The problem is, the classes in each row vary. What is the most efficient way to implement this?

Here is a toy example using just one row:

a = [11, 12, 13]
a_probs = [0.2,  0.1, 0.02]

b = [8, 11, 15]
b_probs = [0.05, 0.4, 0.12]

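In this example, only class 11 occurs in both matrices. A class that is missing from a matrix counts as probability 0 there, so the averaged probability for each class is:

class 8:  (0.05 + 0)/2 = 0.025
class 11: (0.2 + 0.4)/2 = 0.3
class 12: (0.1 + 0)/2 = 0.05
class 13: (0.02 + 0)/2 = 0.01
class 15: (0.12 + 0)/2 = 0.06

The top 3 classes for this row are therefore 11, 15 and 12.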

My current method is very slow: for each row, I concatenate the class IDs from all matrices, take the unique IDs, locate and sum the probabilities for each class, and divide by the number of matrices.
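
To make the per-row computation concrete, here is a minimal sketch in Python (an assumption, since the language is not stated above). The function name `top_k_averaged` is hypothetical, and the sketch assumes that a class absent from a matrix contributes probability 0. It accumulates sums in a dictionary in a single pass instead of doing concatenate, unique and locate:

```python
from collections import defaultdict
import heapq


def top_k_averaged(class_rows, prob_rows, k=3):
    """Average one row's probabilities across matrices and return the top k.

    class_rows and prob_rows hold one sequence per matrix, all for the
    same row. A class missing from a matrix is assumed to contribute 0.
    """
    n_matrices = len(class_rows)
    totals = defaultdict(float)
    # Single pass: add each probability to the running sum for its class.
    for classes, probs in zip(class_rows, prob_rows):
        for cls, p in zip(classes, probs):
            totals[cls] += p
    # Divide by the number of matrices, not by the number of occurrences.
    averaged = {cls: total / n_matrices for cls, total in totals.items()}
    # Return (class, averaged probability) pairs, highest probability first.
    return heapq.nlargest(k, averaged.items(), key=lambda item: item[1])


a, a_probs = [11, 12, 13], [0.2, 0.1, 0.02]
b, b_probs = [8, 11, 15], [0.05, 0.4, 0.12]

top3 = top_k_averaged([a, b], [a_probs, b_probs])
print([cls for cls, _ in top3])  # classes 11, 15, 12
```

This is only a per-row illustration of the arithmetic, not a tuned solution for large matrices; it would have to be called once per row.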

Chris Parry
  • Do you want an average for every 2 rows, or an average over everything (i.e. the whole large matrix)? – GameOfThrows Jun 16 '16 at 10:36
  • It can probably be done, but you need to post a more complete toy example. – Luis Mendo Jun 16 '16 at 10:43
  • Is the class ID always an integer, and do you know the number of classes before the calculation starts? – Finn Jun 16 '16 at 10:55
  • Could you please add your code? Maybe it could be optimized. Also, how many rows are there, and do you know that number beforehand? – Finn Jun 16 '16 at 11:03

0 Answers