I have a list of lists containing KeyValuePair<int, double>, each sorted in descending order on the double, where the int stands for the ID and the double for the score of that ID. I would like to create an algorithm that gives me the top-k IDs whose summed score across all lists is highest. I know how the algorithm works, but I don't know how to implement it in C#.
So the idea is to take the first element of each list and calculate the 'threshold': the sum of the scores in that row, which is the maximum total that any ID we have not seen yet can still reach. Then we look up the total score of each ID in that row across all lists and put it in a buffer. Then we move to the next row, calculate its threshold and the totals of its new IDs, and if one of the totals in the buffer is higher than the current threshold, we know it belongs in the top-k and put it there. We go on until we have k values. I hope someone can point me in the right direction.
Some example data: there are two lists in a list, which are the following:
{(1, 25), (2, 23), (3, 19), (4, 10), (5, 3)},
{(2, 24), (3, 20), (1, 15), (5, 10), (4, 3)}
After the first round we have a threshold of 25 + 24 = 49 and the buffer consists of (1, 40) and (2, 47). We move to the next row and find the values (2, 23) and (3, 20), so the threshold is 43 and we add (3, 39) to the buffer. We see in the buffer that the tuple with ID 2 has a score (47) higher than the threshold, so we add it to the top-k.
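The procedure described above could be sketched in C# roughly as follows. This is only one possible reading of it, not a definitive implementation: the class name ThresholdTopK and the helper Flush are made up for illustration, and the sketch assumes that every list is sorted by score in descending order, that scores are non-negative, that an ID appears at most once per list, and that an ID missing from a list contributes 0. It also builds a dictionary per list so that scores can be looked up by ID, which is a one-time pass over the data and only pays off if the lists are queried more than once or such an index already exists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ThresholdTopK
{
    // Hypothetical helper, not a library API. Assumes lists sorted by score
    // descending, non-negative scores, and a missing ID counting as 0.
    public static List<KeyValuePair<int, double>> TopK(
        List<List<KeyValuePair<int, double>>> lists, int k)
    {
        // One ID -> score index per list for random access (one-time O(n) pass).
        var index = lists.Select(l => l.ToDictionary(p => p.Key, p => p.Value)).ToList();
        var seen = new HashSet<int>();                       // IDs already totalled
        var buffer = new List<KeyValuePair<int, double>>();  // candidates (ID, total)
        var result = new List<KeyValuePair<int, double>>();  // confirmed top-k, best first
        int depth = lists.Count == 0 ? 0 : lists.Max(l => l.Count);

        for (int row = 0; row < depth && result.Count < k; row++)
        {
            double threshold = 0; // upper bound on the total of any unseen ID
            foreach (var list in lists)
            {
                if (row >= list.Count) continue; // exhausted list contributes 0
                threshold += list[row].Value;
                int id = list[row].Key;
                if (!seen.Add(id)) continue;     // total was computed earlier

                double total = 0;
                foreach (var scores in index)
                    if (scores.TryGetValue(id, out double s)) total += s;
                buffer.Add(new KeyValuePair<int, double>(id, total));
            }
            Flush(buffer, result, k, threshold);
        }

        // All rows read: the remaining candidates only compete with each other.
        Flush(buffer, result, k, double.NegativeInfinity);
        return result;
    }

    // Moves buffered candidates with total >= threshold into result, best first.
    private static void Flush(List<KeyValuePair<int, double>> buffer,
        List<KeyValuePair<int, double>> result, int k, double threshold)
    {
        buffer.Sort((a, b) => b.Value.CompareTo(a.Value));
        int n = 0;
        while (n < buffer.Count && result.Count < k && buffer[n].Value >= threshold)
            result.Add(buffer[n++]);
        buffer.RemoveRange(0, n);
    }
}
```

With the two example lists and k = 2, this sketch reads three rows and returns (2, 47) followed by (1, 40). Re-sorting the buffer on every row keeps the code short; for large buffers a heap would likely be the better choice.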
The lists contain up to 100000 KeyValuePairs, and the point is to process only a few KeyValuePairs to find the top-k, instead of just calculating the score for every tuple and taking the top-k.
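For comparison, the exhaustive approach mentioned here (score every tuple, then take the top-k) might look like the following LINQ sketch. The class name BruteForceTopK is made up for illustration; the sketch assumes a missing ID simply contributes nothing to its sum. It touches every pair, so it is not the goal, but it could serve as a reference to check a threshold-based implementation against.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BruteForceTopK
{
    // Hypothetical baseline: sums every score per ID, then keeps the k largest totals.
    public static List<KeyValuePair<int, double>> TopK(
        List<List<KeyValuePair<int, double>>> lists, int k)
    {
        return lists
            .SelectMany(l => l)                      // flatten all (ID, score) pairs
            .GroupBy(p => p.Key, p => p.Value)       // collect the scores of each ID
            .Select(g => new KeyValuePair<int, double>(g.Key, g.Sum()))
            .OrderByDescending(p => p.Value)         // highest total first
            .Take(k)
            .ToList();
    }
}
```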