
The following data is given as df:

id class country weights
a 1 US 20
b 2 US 5
a 2 CH 5
a 1 CH 10
b 1 CH 5
c 1 US 10
b 2 GER 15
a 2 CH 5
c 1 US 15
a 1 US 10

The goal is to create an alternative allocation of the weights column while keeping the distribution over the unique values in id, class and country. For example, id "a" appears in 5 of the 10 rows and carries 50 of the total weight of 100, i.e. 50%. An alternative solution for weights should keep this share of a = 50%, and likewise the share of every other unique value in the first three columns.

For this I wrote the following code, which builds a dict with the total weight per unique value:

constraint_columns = ["id", "class", "country"]
constraints = {}

for column in constraint_columns:
    # Total weight per unique value of the column, e.g. {'a': 50, 'b': 25, 'c': 25}.
    constraints[column] = df.groupby(column)["weights"].sum().to_dict()

The result looks as follows:

{'id': {'a': 50, 'b': 25, 'c': 25},
'class': {1: 70, 2: 30},
'country': {'CH': 25, 'GER': 15, 'US': 60}}

I then initialize the model, create the variables for the model to solve (the weights), and create the constraints by looping through my constraints dict and mapping its entries to the variables:

from ortools.sat.python import cp_model

model = cp_model.CpModel()
solver = cp_model.CpSolver()

# One integer variable in [0, 100] per row of df, keyed by row position.
dict_weights = {}
for count in range(len(df)):
    dict_weights[count] = model.NewIntVar(0, 100, f"weight_{count}")

# The same variables as a list, in row order.
weights_full = list(dict_weights.values())

I allow each total to deviate by up to 5% from its original value:

for constraint in constraints:
    for key in constraints[constraint]:
        keys = df.loc[df[constraint] == key].index
        model.Add(sum(list(map(dict_weights.get, keys))) >= int(constraints[constraint][key] * 1 - ((constraints[constraint][key] * 1) * 0.05)))
        model.Add(sum(list(map(dict_weights.get, keys))) <= int(constraints[constraint][key] * 1 + ((constraints[constraint][key] * 1) * 0.05)))
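
To make the 5% range concrete, here is a small sketch that evaluates the same int(...) expressions outside the solver. The helper name tolerance_bounds is hypothetical and used only for illustration; the totals are the ones from the constraints dict above.

```python
def tolerance_bounds(total, tolerance=0.05):
    """Bounds as computed by the int(...) expressions in the two model.Add calls."""
    lower = int(total * 1 - (total * 1) * tolerance)
    upper = int(total * 1 + (total * 1) * tolerance)
    return lower, upper


# Totals taken from the constraints dict shown earlier.
targets = {"a": 50, "b": 25, "c": 25, 1: 70, 2: 30, "CH": 25, "GER": 15, "US": 60}

for key, total in targets.items():
    print(key, total, tolerance_bounds(total))
```

Because int() truncates, the range is not always symmetric: for GER (total 15) the bounds come out as 14 and 15, since 15.75 is truncated to 15, so there is no slack above the original value.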

I solve the model and everything works fine:

solver.parameters.cp_model_presolve = False  # type: ignore
solver.parameters.max_time_in_seconds = 0.01  # type: ignore
solution_collector = VarArraySolutionCollector(weights_full)

solver.SolveWithSolutionCallback(model, solution_collector)
solution_collector.solution_list

Solution:

[0, 0, 0, 0, 8, 0, 15, 15, 23, 35]

As a next step, I want to tell the model that the result should consist of a specific number of non-zero weights. For example, with 3, seven of the ten weight values should be 0 and only three are used to find a solution that fits the distribution. For now it does not matter whether a feasible solution exists.

Any ideas how to solve this?

Robert Kl

1 Answer

  1. The solver is integral. 0.05 will be silently rounded to 0 by Python.

  2. I do not understand your problem. My gut reaction is to create one bool var per weight value and per item, and convert all constraints to weighted sums of these bool variables.

Laurent Perron