Discrete optimization - selecting exactly N items from each row and column of a score matrix

Question

Given a matrix of scores, I'd like to select exactly n elements from each column and each row such that the total score of the selected elements across the entire matrix will be as high as possible.

Example: given the score matrix

array([[0.65500799, 0.79214695, 0.39854742],
       [0.53634974, 0.3682463 , 0.99663978],
       [0.73423119, 0.87150676, 0.80823699]])

The optimal selection for n=1 is:

array([[1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.]])

The total score of this solution is 0.65500799 + 0.87150676 + 0.99663978 = 2.52315453.

The optimal selection for n=2 is:

array([[1., 1., 0.],
       [1., 0., 1.],
       [0., 1., 1.]])

The total score of this solution is 0.65500799 + 0.53634974 + 0.79214695 + 0.87150676 + 0.99663978 + 0.80823699 = 4.65988821.

These solutions were obtained by a naive Breadth-First Search (BFS). However, this approach isn't computationally feasible (run time explodes) for larger problems (e.g., 10x10, n=2).
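To make the blow-up concrete, here is a hypothetical exhaustive search. It is not the BFS code mentioned above (which is not shown), only a minimal sketch that assumes a square matrix; the function name `brute_force_select` is made up for illustration. It tries every way of picking n columns in each row, so an m x m matrix needs C(m, n)^m candidates. For 10x10 with n=2 that is 45^10, roughly 3.4e16.

```python
from itertools import combinations, product

def brute_force_select(scores, n):
    """Exhaustively find the best selection with exactly n picks per row and column.

    Assumes `scores` is a square list of lists. Only practical for tiny matrices.
    """
    size = len(scores)
    # Every way to choose n column indices within a single row.
    row_patterns = list(combinations(range(size), n))
    best_total, best_choice = float("-inf"), None

    # Try one pattern per row; keep only choices where every column is used n times.
    for choice in product(row_patterns, repeat=size):
        col_counts = [0] * size
        for cols in choice:
            for j in cols:
                col_counts[j] += 1
        if any(count != n for count in col_counts):
            continue
        total = sum(scores[i][j] for i, cols in enumerate(choice) for j in cols)
        if total > best_total:
            best_total, best_choice = total, choice

    if best_choice is None:
        raise ValueError("no feasible selection for this n")
    selection = [[1 if j in cols else 0 for j in range(size)] for cols in best_choice]
    return selection, best_total

scores = [
    [0.65500799, 0.79214695, 0.39854742],
    [0.53634974, 0.3682463, 0.99663978],
    [0.73423119, 0.87150676, 0.80823699],
]

for n in (1, 2):
    selection, total = brute_force_select(scores, n)
    print(n, selection, round(total, 8))
```

On the 3x3 example this reproduces the two selections shown above, which is a handy cross-check for any faster method on small inputs.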

Questions:

How is this discrete optimization problem classified?
What heuristics can allow finding good solutions for this problem quickly?
What Python libraries implement these heuristics?

Try at https://stats.stackexchange.com/, maybe they got some ideas. — Joe, May 23 '19 at 07:50
This can be formulated as an integer programming problem and solved quickly using off-the-shelf IP solvers. If you are open to this approach, I can provide more info. — LarrySnyder610, May 24 '19 at 11:35

score 2 · Accepted Answer · answered May 24 '19 at 13:10

Here is a solution based on integer programming (IP).

Decision variables: x[i,j] = 1 if we select the item in row i, column j, and 0 otherwise.

Parameters (inputs): s[i,j] = score for entry (i, j)

Formulation:

maximize sum {i, j} s[i,j] * x[i,j]
subject to sum {i} x[i,j] = n     for all j
           sum {j} x[i,j] = n     for all i
           x[i,j] in {0,1}        for all i, j

You can implement this in Python/PuLP or a solver-specific package such as gurobipy or docplex. I would expect that these solvers can solve even moderately large instances of the problem, to optimality (not heuristically), within a fraction of a second.
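For illustration, the formulation above might be written in PuLP roughly as follows. This is a minimal sketch, not a tested production model: the function name `select_n_per_line`, the variable names, and the use of the CBC solver that ships with PuLP are all assumptions, and the score matrix is the one from the question.

```python
import pulp

def select_n_per_line(scores, n):
    """Pick exactly n entries per row and per column, maximizing the total score."""
    rows = range(len(scores))
    cols = range(len(scores[0]))

    prob = pulp.LpProblem("select_n_per_row_and_column", pulp.LpMaximize)

    # x[i, j] = 1 if entry (i, j) is selected, 0 otherwise.
    x = {
        (i, j): pulp.LpVariable(f"x_{i}_{j}", cat="Binary")
        for i in rows
        for j in cols
    }

    # Objective: total score of the selected entries.
    prob += pulp.lpSum(scores[i][j] * x[i, j] for i in rows for j in cols)

    # Exactly n selected entries in every column and in every row.
    for j in cols:
        prob += pulp.lpSum(x[i, j] for i in rows) == n
    for i in rows:
        prob += pulp.lpSum(x[i, j] for j in cols) == n

    # Assumption: the CBC solver bundled with PuLP; msg=False silences its log.
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[prob.status] != "Optimal":
        raise ValueError(f"solver status: {pulp.LpStatus[prob.status]}")

    selection = [[int(round(x[i, j].value())) for j in cols] for i in rows]
    return selection, pulp.value(prob.objective)

scores = [
    [0.65500799, 0.79214695, 0.39854742],
    [0.53634974, 0.3682463, 0.99663978],
    [0.73423119, 0.87150676, 0.80823699],
]

for n in (1, 2):
    selection, total = select_n_per_line(scores, n)
    print(n, selection, round(total, 8))
```

If the scores live in a NumPy array, passing `scores.tolist()` keeps the coefficients as plain Python floats.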

Following your answer, I found docplex example code for a very similar problem (https://github.com/IBMDecisionOptimization/docplex-examples/blob/master/examples/mp/jupyter/boxes.ipynb), and it works like a charm. — Trisoloriansunscreen, May 24 '19 at 16:14
