Using Python Ray With CPLEX Model Object

Question

I am trying to parallelize an interaction with a Python object that is computationally expensive. I would like to use Ray to do this but so far my best efforts have failed.

The object is a CPLEX model object and I'm trying to add a set of constraints for a list of conditions.

Here's my setup:

import numpy as np
import docplex.mp.model as cpx
import ray

m = cpx.Model(name="mymodel")

def mask_array(arr, mask_val):
    array_mask = np.argwhere(arr == mask_val)
    arg_slice = [i[0] for i in array_mask]
    return arg_slice

weeks = [1,3,7,8,9]
const = 1.5
r = rate = np.array(df['r'].tolist(), dtype=float)  # np.float was removed in NumPy 1.24
x1 = m.integer_var_list(data_indices, lb=lower_bound, ub=upper_bound)
x2 = m.dot(x1, r)

@ray.remote
def add_model_constraint(m, x2, x2sum, const):
    m.add_constraint(x2sum <= x2*const)
    return m

x2sums = []
for w in weeks:
    arg_slice = mask_array(x2, w)
    x2sum = m.dot([x2[i] for i in arg_slice], r[arg_slice])
    x2sums.append(x2sum)

#: this is the expensive part 
for x2sum in x2sums:
    add_model_constraint.remote(m, x2, x2sum, const)

In a nutshell, what I'm doing is creating a model object, some variables, and then looping over a set of weeks in order to build a constraint. I subset my variable, compute some dot products and apply the constraint. I would like to be able to create the constraint in parallel because it takes a while but so far my code just hangs and I'm not sure why.

I don't know if I should return the model object from my function, because by default the m.add_constraint method modifies the object in place. At the same time, I know Ray returns references to the remote value, so I'm not sure what is supposed to happen there.

Is this at all a valid use of Ray? Is it reasonable to expect to be able to modify a CPLEX object in this way (or any other arbitrary Python object)?

I am new to Ray, so I may be structuring this all wrong, or maybe this will never work for X, Y, and Z reasons, which would also be good to know.

I have never heard of or used `ray` before, but your code snippet appears to be incomplete. First, there is no `import ray`. Second, according to the quick start documentation [here](https://github.com/ray-project/ray), don't you need a `ray.get()` call? Finally, I wonder if there is a way for you to create the constraint expressions in parallel and then just call `m.add_constraints` once (in a batch)? — rkersh, Sep 19 '19 at 16:39
I just updated the code snippet to include the `import ray` statement. I'm not sure how the `ray.get()` command comes into play when you have N threads operating on a shared object. The last suggestion is a good one. I'll try it out ASAP. — Sledge, Sep 19 '19 at 16:55

score 0 · Answer 1 · answered Feb 28 '20 at 13:18

The Model object is not designed to be used in parallel. You cannot add constraints from multiple threads at the same time; doing so results in undefined behavior. You will need at least a lock to make sure only one thread at a time adds constraints.
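A minimal sketch of the locking idea, using only the standard library. `StandInModel` is a hypothetical placeholder for an object that is not thread-safe, not the docplex class, and the tuple it stores stands in for a real constraint. With a real model, any work that touches the model would need to go inside the lock as well.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StandInModel:
    """Hypothetical stand-in for a model object that is not thread-safe."""

    def __init__(self):
        self.constraints = []

    def add_constraint(self, ct):
        self.constraints.append(ct)


model = StandInModel()
model_lock = threading.Lock()


def build_and_add(week):
    # Independent preparation that does not touch the model can run unlocked.
    spec = ("week", week)
    # Only one thread at a time may modify the shared model.
    with model_lock:
        model.add_constraint(spec)


with ThreadPoolExecutor(max_workers=4) as pool:
    # Consume the iterator so every task has finished before we continue.
    list(pool.map(build_and_add, [1, 3, 7, 8, 9]))
```

The lock makes the additions safe, but it also serializes them, so this only helps if the unlocked preparation step is the expensive part.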

Note that parallel model building may not be a good idea at all: the order of constraints will be more or less random, and the behavior of the solver may depend on the order of constraints (this is called performance variability). So you may have a hard time reproducing certain results or behavior.
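One way to keep the constraint order reproducible is to do only model-independent preparation in parallel and let a single thread add the constraints afterwards. The sketch below assumes the per-week work can be reduced to plain index lists; `week_of_row` and `rows_for_week` are hypothetical names, not part of the original code.

```python
from concurrent.futures import ThreadPoolExecutor

weeks = [1, 3, 7, 8, 9]
# Hypothetical data: the week label attached to each variable/row.
week_of_row = [1, 3, 1, 7, 8, 9, 3, 9]


def rows_for_week(week):
    """Pure-Python preparation: indices of the rows that belong to `week`."""
    return week, [i for i, w in enumerate(week_of_row) if w == week]


with ThreadPoolExecutor(max_workers=4) as pool:
    # map() yields results in input order, whichever worker finishes first.
    specs = list(pool.map(rows_for_week, weeks))

# The main thread would now add one constraint per spec, in this fixed order.
ordered_weeks = [week for week, _ in specs]
```

Because `specs` always follows the order of `weeks`, the model would be built identically on every run.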

score 0 · Answer 2 · answered Mar 01 '20 at 16:07

I understand the primary issue was the performance of model building. From the code you sent, I have two suggestions to address this:

- Post constraints in batches: store the constraints in a list and add them all at once with Model.add_constraints(). This should be more efficient than adding them one at a time.
- Experiment with Model.dotf() (functional-style scalar product). It avoids building auxiliary lists; instead you pass a function that takes the key and returns the coefficient. This method is new in Docplex version 2.12. For example, assuming a list of 3 variables:

abc = m.integer_var_list(3, name=["a", "b", "c"])
m.dotf(abc, lambda k: k+2)

docplex.mp.LinearExpression(a+2b+3c)

Model.dotf() is usually faster than Model.dot().
