
Example: I have the following matrix:

[
    [1, 2, 4],
    [5, 2, 7],
    [2, 3, 3],
]

The maximum sum would be 12 (5 + 3 + 4) with the index-tuples (1, 0); (2, 1); (0, 2).

I can only think of a brute-force solution: looping through all possible index permutations, calling numpy.choose with the indices, and calculating the sum of the selected values. Is there a more elegant solution, maybe even a ready-made function for this in numpy or another library?
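For reference, the brute-force approach described above could be sketched roughly as follows. This is a minimal illustration, not a recommended solution: it uses NumPy integer-array indexing instead of numpy.choose, and names such as best_cols and best_sum are made up for the example. It enumerates all n! column permutations, so it is only practical for small matrices.

```python
from itertools import permutations

import numpy as np

arr = np.array([[1, 2, 4],
                [5, 2, 7],
                [2, 3, 3]])

n = arr.shape[0]
rows = np.arange(n)

# Each permutation assigns one distinct column to every row,
# so every row and every column is used exactly once.
best_cols = max(permutations(range(n)),
                key=lambda cols: arr[rows, list(cols)].sum())
best_sum = int(arr[rows, list(best_cols)].sum())

print(best_sum)                        # 12
print(list(zip(range(n), best_cols)))  # [(0, 2), (1, 0), (2, 1)]
```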

MangoNrFive

2 Answers


This problem can be seen as an instance of the assignment problem. In assignment-problem terms, the rows are the workers and the columns are the tasks, and you need to assign exactly one worker to each task while minimising the overall cost. Because you want the maximum sum here, negate the values so that minimising the cost maximises the sum.

This problem can be solved using the Hungarian algorithm. It also admits a bipartite graph formulation, so we can use nx.bipartite.minimum_weight_full_matching as below:

import numpy as np
import networkx as nx

arr = np.array([[1, 2, 4],
                [5, 2, 7],
                [2, 3, 3], ])

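n_rows, n_cols = arr.shape
flat_values = arr.flatten()
# Label rows "r0", "r1", ... and columns "c0", "c1", ... in the same row-major order as flatten()
indices = [(f"r{i}", f"c{j}") for i in range(n_rows) for j in range(n_cols)]

G = nx.Graph()
for (start, end), length in zip(indices, flat_values):
    # Negate the weights so that a minimum-weight matching maximises the sum
    G.add_edge(start, end, length=-int(length))

match = nx.bipartite.minimum_weight_full_matching(G, weight="length")

# The matching contains both directions; keep only the row -> column pairs
res = sorted((k, v) for k, v in match.items() if k.startswith("r"))
print(res)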

Output

[('r0', 'c2'), ('r1', 'c0'), ('r2', 'c1')]

Here r stands for row and c for column, so the selected values are 4, 5 and 3, which sum to 12.
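As an alternative sketch, the same assignment problem can be handed directly to scipy.optimize.linear_sum_assignment, which avoids building a graph. This assumes SciPy 1.4 or newer is installed, since the maximize keyword is not available in older releases; the variable names are illustrative only.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

arr = np.array([[1, 2, 4],
                [5, 2, 7],
                [2, 3, 3]])

# maximize=True asks for the assignment with the largest total,
# so the values do not need to be negated by hand.
row_ind, col_ind = linear_sum_assignment(arr, maximize=True)

pairs = [(int(r), int(c)) for r, c in zip(row_ind, col_ind)]
total = int(arr[row_ind, col_ind].sum())

print(pairs)  # [(0, 2), (1, 0), (2, 1)]
print(total)  # 12
```

The returned index arrays can be used directly for indexing, as in arr[row_ind, col_ind], to recover the selected values.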

Dani Mesejo

If I'm understanding right, you want a way to calculate the maximum values in each row or column, and get their locations, without having to manually loop over each value?

If that's the case, you could use numpy.amax to get the maximum values in each column or row, and numpy.argmax to get the locations of these values.
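A small sketch of what these two functions return for the matrix from the question, taking the maximum along each row (axis=1). Note that the per-row maxima are found independently, so nothing forces them to sit in distinct columns.

```python
import numpy as np

arr = np.array([[1, 2, 4],
                [5, 2, 7],
                [2, 3, 3]])

row_max = np.amax(arr, axis=1)       # largest value in each row
row_argmax = np.argmax(arr, axis=1)  # column index of that value

print(row_max)     # [4 7 3]
print(row_argmax)  # [2 2 1]  (column 2 is picked for both row 0 and row 1)
```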

Mouse
  • No, I want to find the values with the maximum sum while selecting only one value in each row and each column. Note that 7 isn't among the selected values, as the maximum possible sum would be only 11 when selecting 7: (7 + 2 + 2) or (7 + 3 + 1). So the maximum value in a row or column isn't necessarily one of the values that maximize the sum (that's why I chose this example). – MangoNrFive Aug 04 '22 at 10:42
  • Oh I see, so choosing 5 means you cannot select anything else from row 1 or column 0? Hmm, in that case I'm not sure you can do it simply with any numpy function that I know of. This screams recursion to me -- perhaps there is a greedy optimal solution? – Mouse Aug 04 '22 at 10:45