-1

I am looking for a way to minimize costs with multiple constraints by selecting one option per group out of a large dataset with groups that each contain options. The dataset consists of around 100 groups with 200 options each.

Below is a list of the conditions for the optimization, an example of the data and what the result should be. On small datasets I just looped over all combinations but with the actual large dataset this would take forever. I looked into SciPy Optimize Minimize but that doesn't seem suitable. Is there an existing optimization framework that would be usable to find the lowest costs? If not, what would be a good way to solve it in Python?

Conditions

  • Exactly one option per group
  • Minimize sum of costs
  • Sum of A must be lower than 10
  • Sum of B must be lower than 12

Dataset

+-------+--------+-----+-----+-------+
| Group | Option |  A  |  B  | Costs |
+-------+--------+-----+-----+-------+
|     1 |      1 |  10 |   0 |    10 |
|     1 |      2 |   0 |   0 |    21 |
|     1 |      3 |   0 |   7 |    15 |
|     2 |      1 |   8 |   0 |     8 |
|     2 |      2 |   0 |   0 |    34 |
|     2 |      3 |   0 |   5 |    18 |
|     3 |      1 |   9 |   0 |     9 |
|     3 |      2 |   0 |   0 |    20 |
|     3 |      3 |   0 |   6 |     7 |
+-------+--------+-----+-----+-------+

Result

+-------+--------+
| Group | Option |
+-------+--------+
|     1 |      1 |
|     2 |      3 |
|     3 |      3 |
+-------+--------+
Total costs: 35
Sum A: 10
Sum B: 11
atoomkern
  • 5
  • 2
  • Is your python using SQL in any way? – JacobIRR Sep 12 '17 at 18:53
  • Welcome to StackOverflow. Please read [How to Ask](https://stackoverflow.com/help/how-to-ask), and include details of what you have tried, and be more specific about what your question/problem is. – Antimony Sep 12 '17 at 18:55
  • @JacobIRR Yes, the dataset is stored in a MariaDB database. – atoomkern Sep 12 '17 at 18:57
  • @Antimony thank you for the suggestions. I have editted my post and I hope it is more clear now. – atoomkern Sep 12 '17 at 19:06
  • @roganjosh I added my questions and I will look into combinatorial optimisation. Thank you for the suggestion. – atoomkern Sep 12 '17 at 19:07
  • @atoomkern I'm not aware of a maintained python library that is suitable for what you are trying to do. It sounds like you want to use metaheuristics for this but I have always implemented them myself from scratch. – roganjosh Sep 12 '17 at 19:09
  • I'm unclear how you got the result you did. In general, this sounds like a linear programming problem with a possible groupby before-hand. [Pulp](http://pythonhosted.org/PuLP/) is a library you could use but it has quite a steep learning curve IMO. – Jarad Sep 12 '17 at 21:03

1 Answers1

0

Here is a solution using CVXPY (http://www.cvxpy.org/en/latest/index.html). I arranged the data such that A, B, cost are matrices and column i represents the options for group i.

import cvxpy as cvx
import numpy as np

A = np.array([[10,8,9],[0,0,0],[0,0,0]])
B = np.array([[0,0,0],[0,0,0],[7,5,6]]) 
cost = np.array([[10,21,15],[8,34,18],[9,20,7]]).T

choices = cvx.Int(3,3)
constraints = [cvx.sum_entries(choices[:,i]) == 1 for i in range(3)] + [cvx.sum_entries(cvx.mul_elemwise(A,choices)) <= 10] + [cvx.sum_entries(cvx.mul_elemwise(B,choices)) <= 12] + [choices >= 0, choices <= 1]
objective = cvx.Minimize(cvx.sum_entries(cvx.mul_elemwise(cost,choices)))
prob = cvx.Problem(objective,constraints)
prob.solve()

print(int(round(prob.value)))
print(np.matrix(prob.variables()[0].value.round(),dtype = np.int))

For some reason the output values are still float, so I cast them back to integer.

Hennich
  • 682
  • 3
  • 18
  • Thank you for the detailed reply! It works very well. For the large dataset it takes about 50 seconds with the GLPK_MI solver. I am still figuring out how to speed this up. – atoomkern Sep 20 '17 at 05:47