
I'm trying to use or-tools to solve a MIP problem. It runs quite fast for small values of Number and Range (you may treat them as the number of variables and the number of constraints), but for larger values it slows down and sometimes fails to find a feasible solution. Is there any way to improve the speed with Number = 1,000,000 and Range = 10,000?

Problem description

variables: x, r, with #x = #r = Number

Objective: Minimize sum(r)

Subject to: (1) each r is either 0 or 1.

(2) each x is an integer.

(3) x <= M*r, where M is a very large constant (big M).

(4) for each unique item in buckets, sum(x[buckets == item]) falls in [count(item)*0.9, count(item)*1.1].

(5) x has a lower bound whenever x > 0.

from __future__ import print_function
from ortools.sat.python import cp_model
import numpy as np
import pandas as pd

np.random.seed(1)


def set_up_buckets(Number,Range,mu=0,sigma=1):
    # draw Number samples from N(mu, sigma) and scale them to integer bucket ids
    buckets = (np.random.normal(mu, sigma, Number)*Range).astype(int)
    print("number of buckets:",len(np.unique(buckets)))
    return buckets

model = cp_model.CpModel()
Number = 1000000
Range = 10000
Infinity = int(Number/10)
buckets = set_up_buckets(Number,Range)

def reduce_intervals(buckets,Range,sigma_cnt,mu=0,sigma=1):
    upper = (mu+sigma*sigma_cnt)*Range
    lower = (mu-sigma*sigma_cnt)*Range
    buckets_reduced = buckets[(buckets<=upper)&(buckets>=lower)]
    return buckets_reduced

buckets_reduced = reduce_intervals(buckets,Range,2)

def set_up_variables(Number):
    x,r = [],[]
    for i in range(Number):
        x.append(model.NewIntVar(0, Infinity, 'x{}'.format(i+1)))
        r.append(model.NewBoolVar('r{}'.format(i+1)))  
    return np.array(x),np.array(r)

x,r = set_up_variables(Number)

def set_up_constraints(Number,unique_buckets):
    M = Number  # big-M constant linking x and r
    x_lower_bound = int(Number/10000)
#     x_upper_bound = int(Number/10)
    print("add attributes constraints")
    for i in unique_buckets:
        xs = x[buckets==i]
        model.Add(sum(xs) <= int(1.1*len(xs)))
        model.Add(sum(xs) >= int(0.9*len(xs)))

    print("add x&r constraints")
    for i in range(Number):
        model.Add(x[i] <= M*r[i])  # x[i] can be positive only if r[i] == 1
        model.Add(x_lower_bound - x[i] <= M*(1-r[i]))  # if r[i] == 1 then x[i] >= x_lower_bound

    model.Minimize(sum(r))
    return model

model = set_up_constraints(Number,np.unique(buckets_reduced))


def solve_model(model,max_seconds = 60*60):
    solver = cp_model.CpSolver()
    solver.parameters.log_search_progress = True
    solver.parameters.num_search_workers = 8
    solver.parameters.max_time_in_seconds = max_seconds
    status = solver.Solve(model)
    return solver,status

solver,status = solve_model(model)
Han Zhu
  • You could try `num_search_workers = 8`; according to OR-Tools developer Laurent Perron, the best diversity, and thus robustness and performance, for CP-SAT is reached at 8 threads. In my limited experience that has been true. See the comments on this answer: https://or.stackexchange.com/a/4126/2477. You could also try reducing the big-M values; whether that helps will depend on the specific instance (data). – dhasson Jul 29 '20 at 14:31
  • After 1 hour, you still haven't built the model :-) So search parameters are irrelevant at this point. – Laurent Perron Jul 29 '20 at 15:36

1 Answer

The method set_up_constraints is catastrophic: you are scanning, in a dense way, O(Number^2) elements to build very sparse linear inequalities.

You can try different code; for instance, the code below does not create the majority of the x[i] * 0 terms, but it is still slow.

for i in unique_buckets:
    xs = [x[j] for j in range(len(x)) if buckets[j] == i]
    model.Add(sum(xs) <= int(1.1*len(xs)))
    model.Add(sum(xs) >= int(0.9 * len(xs)))

Anyway, for the time being, the performance problem comes from your code, not from or-tools.
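One way to keep model construction linear in Number, rather than rescanning every variable once per bucket, is to collect the indices for each bucket value in a single pass with a dictionary. A sketch (`group_indices` is an illustrative helper, not part of or-tools):

```python
from collections import defaultdict

def group_indices(buckets):
    """One O(len(buckets)) pass collecting variable indices per bucket value."""
    groups = defaultdict(list)
    for j, b in enumerate(buckets):
        groups[b].append(j)
    return groups

buckets = [5, 3, 5, 7, 3, 5]
groups = group_indices(buckets)
# the constraint loop then touches each index exactly once:
# for b, idx in groups.items():
#     xs = [x[j] for j in idx]
#     model.Add(sum(xs) <= int(1.1 * len(idx)))
#     model.Add(sum(xs) >= int(0.9 * len(idx)))
```

With numpy arrays, `np.argsort` on `buckets` followed by slicing achieves the same effect, which matches the questioner's later comment that switching to numpy brought construction down to about 2 minutes.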

Laurent Perron
  • Thank you Laurent. I changed this code to use numpy arrays. Now it can set up the constraints in 2 minutes with Number = 1,000,000 and Range = 10,000. – Han Zhu Jul 30 '20 at 07:17
  • Hi Laurent, it now sets up the constraints in 2 minutes, but it consumes a lot of memory. Is there any other way to improve the performance? As you can see, I'm trying to test the limits of or-tools, so I mocked these data, and we can add as many constraints as possible for an optimal or feasible solution. – Han Zhu Jul 31 '20 at 02:55