
I'm trying to solve a Constraint Satisfaction Optimisation Problem that assigns agents to tasks. However, unlike the basic Assignment Problem, an agent can be assigned to many tasks as long as the tasks do not overlap. Each task has a fixed start_time and end_time. The agents are assigned to the tasks according to some unary and binary constraints.

Variables = set of tasks

Domain = set of compatible agents (for each variable)

Constraints = unary&binary

Optimisation function = some linear function

An example of the problem: the allocation of parking space (or teams) for trucks for which we know the arrival and departure time.

I'm interested in whether the literature has a precise name for this type of problem. I presume it is some kind of assignment problem. Also, if you have ever approached this problem, how did you solve it?

Thank you.

etov
rad

2 Answers


I would interpret this as a rectangular assignment problem with conflicts, which is arguably much harder (NP-hard in general) than the polynomially solvable assignment problem.

The demo shown in the other answer might work, and ortools' cp-sat is great, but I don't see a good reason to use discrete-time based reasoning here the way it's done there: interval variables plus scheduling constraints based on edge-finding and co. (+ conflict-analysis / explanations). This machinery is total overkill and the overhead will be noticeable. I don't see any need to reason about time, only about time-induced conflicts.

Edit: One could label those two approaches (linked + proposed) as compact formulation and extended formulation. Extended formulations usually show stronger relaxations and better (solving) results as long as scalability is not an issue. Compact approaches might become more viable again with bigger data (but it's hard to guess here, as scheduling propagators are not that cheap).

What I would propose:

  • (1) Formulate an integer-programming model following the basic assignment-problem formulation, adapted to make it rectangular: a worker is allowed to tackle multiple tasks while every task is tackled by exactly one worker (one of the two sum-equalities is dropped)
  • (2) Add integrality, i.e. mark the variables as binary, because the constraint matrix is no longer totally unimodular
  • (3) Add constraints to forbid conflicts
  • (4) Add the remaining constraints (e.g. compatibility)
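
Steps (1) to (4) can be written down as a small model. This is only a sketch under assumptions: the symbols T (tasks), A (workers), A_t (workers compatible with task t), c_{at} (coefficients of the linear objective) and E (pairs of overlapping tasks) are notation introduced here for illustration, not taken from the question.

```latex
\begin{align*}
\min \quad & \sum_{t \in T} \sum_{a \in A_t} c_{at}\, x_{at} \\
\text{s.t.} \quad & \sum_{a \in A_t} x_{at} = 1 && \forall t \in T \\
& x_{at} + x_{at'} \le 1 && \forall \{t, t'\} \in E,\ \forall a \in A_t \cap A_{t'} \\
& x_{at} \in \{0, 1\} && \forall t \in T,\ \forall a \in A_t
\end{align*}
```

Compatibility (4) is handled here by only creating variables for compatible pairs. There is no equality over workers, which is what makes the model rectangular, and the pairwise rows are the naive version of (3).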

Now this is all straightforward, but I would propose one non-naive improvement regarding (3):

  • The conflict constraints can be interpreted as a stable-set polytope
  • Your conflicts are induced by a-priori defined time-windows and their overlaps (as I interpret it; this is the core assumption behind this whole answer)
  • This is an interval graph (because of time-windows)
  • All interval graphs are chordal
  • Chordal graphs allow enumeration of all maximal cliques in poly-time (implying there are only polynomially many)
  • Each maximal clique yields a facet-defining inequality of the stable-set polytope (sum over the clique <= 1)
  • Those inequalities (one for each maximal clique) we add as constraints!
  • (The stable-set polytope on the graph in use here would also allow very powerful semidefinite relaxations, but it's hard to foresee in which cases this would actually help, since SDPs are much harder to work with: warmstart within tree-search; scalability; ...)

This will lead to a poly-size integer-programming problem which should perform very well with a good IP solver (a commercial one or, if open source is needed: Cbc > GLPK).
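
To see why one clique constraint is stronger than the corresponding pairwise constraints in the LP relaxation, consider three mutually overlapping tasks and one fixed worker. The fractional point below is hand-picked for illustration: it satisfies every pairwise constraint but is cut off by the clique constraint.

```python
from itertools import combinations

# Three mutually overlapping tasks (a clique in the conflict graph), one fixed worker.
clique = [0, 1, 2]

# Hand-picked fractional point: the worker is "half assigned" to each task.
x = {task: 0.5 for task in clique}

# Naive formulation: one constraint per conflicting pair.
pairwise_ok = all(x[a] + x[b] <= 1 for a, b in combinations(clique, 2))

# Improved formulation: a single constraint over the whole clique.
clique_ok = sum(x[task] for task in clique) <= 1

print(pairwise_ok, clique_ok)  # True False
```

So the relaxation with clique constraints excludes fractional points that the pairwise relaxation still allows, while needing fewer rows.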

Small demo about (3)

import itertools
import networkx as nx

# data: inclusive, exclusive
# --------------------------
time_windows = [
  (2, 7),
  (0, 10),
  (6, 12),
  (12, 20),
  (8, 12),
  (16, 20)
]

# helper
# ------
def is_overlapping(a, b):
  return (b[1] > a[0] and b[0] < a[1])

# raw conflicts
# -------------
binary_conflicts = [] 
for a, b in itertools.combinations(range(len(time_windows)), 2):
  if is_overlapping(time_windows[a], time_windows[b]):
    binary_conflicts.append( (a, b) )

# conflict graph
# --------------
G = nx.Graph()
G.add_edges_from(binary_conflicts)

# maximal cliques
# ---------------
max_cliques = nx.chordal_graph_cliques(G)

print('naive constraints: raw binary conflicts')
for i in binary_conflicts:
  print('sum({}) <= 1'.format(i))

print('improved constraints: clique-constraints')
for i in max_cliques:
  print('sum({}) <= 1'.format(list(i)))

Output:

naive constraints: raw binary conflicts
sum((0, 1)) <= 1
sum((0, 2)) <= 1
sum((1, 2)) <= 1
sum((1, 4)) <= 1
sum((2, 4)) <= 1
sum((3, 5)) <= 1
improved constraints: clique-constraints
sum([1, 2, 4]) <= 1
sum([0, 1, 2]) <= 1
sum([3, 5]) <= 1

Fun facts:

  • Commercial integer-programming solvers, and maybe even Cbc, might try to do the same reasoning about clique constraints to some degree, although without the assumption of chordality, where it is an NP-hard problem
  • ortools' cp-sat solver also has a code-path for this (again: general NP-hard case)
    • Should trigger when expressing the conflict-based model (it is much harder to decide on this exploitation for general discrete-time based scheduling models)

Caveats

Implementation / Scalability

There are still open questions like:

  • duplicating max-clique constraints over each worker vs. merging them somehow
  • be more efficient/clever in finding conflicts (sorting)
  • will it scale to the data: how big will the graph be / how many conflicts and constraints from those do we need

But those things usually follow instance-statistics (aka "don't decide blindly").
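
Regarding the "sorting" point above: for interval data, the maximal cliques can also be found with one sorted sweep over the start/end events, without building the pairwise conflict graph first. A minimal sketch, assuming half-open windows [start, end) of positive length; the function name interval_max_cliques is made up for this illustration and is not part of networkx.

```python
def interval_max_cliques(time_windows):
    """Maximal cliques (size >= 2) of the conflict graph of half-open windows."""
    events = []
    for idx, (start, end) in enumerate(time_windows):
        if end <= start:
            raise ValueError("this sketch assumes positive-length windows")
        events.append((start, 1, idx))  # 1 = window opens
        events.append((end, 0, idx))    # 0 = window closes; sorts before an open at the same time
    events.sort()

    active = set()   # windows currently open
    grew = False     # True if a window was opened since the last recorded clique
    cliques = []
    for _, kind, idx in events:
        if kind == 1:
            active.add(idx)
            grew = True
        else:
            # The active set is maximal right before the first close after an open.
            if grew and len(active) > 1:
                cliques.append(sorted(active))
            grew = False
            active.remove(idx)
    return cliques


time_windows = [(2, 7), (0, 10), (6, 12), (12, 20), (8, 12), (16, 20)]
print(interval_max_cliques(time_windows))  # [[0, 1, 2], [1, 2, 4], [3, 5]]
```

On the demo data this gives the same three cliques as the graph-based enumeration, in O(n log n) for the sort plus the size of the output.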

sascha
  • This is a beautiful answer. – etov Dec 23 '20 at 13:14
  • Note that the CP-SAT solver is actually also an IP solver, and probably a good one (https://or.stackexchange.com/a/4126). Also, unless I'm missing something, there's additional logic needed on top of the overlap analysis in order to actually assign teams. I'll experiment some with using this approach instead of the overlap constraints. – etov Dec 23 '20 at 13:44
  • Imho: It uses lots of LP-tech, but surely behaves differently. Imho, assignment problems lean very strongly towards IP-tech, and *if* ortools can compete (especially on non-tight, optimization-focused instances) it's due to those LP components, not so much due to conflict-driven clause learning, LNS and co. (lots of tech in there). I suspect classic IP solvers dominate here, but your link makes some strong claims and maybe someone should do a scientific eval on MIPLIB (at some point). That would be cool. The assignment is done by (1): enforce sum=1 over tasks; ignore sum=1 over workers – sascha Dec 23 '20 at 13:59
  • Thank you for the answer. Very rigorous. – rad Jan 06 '21 at 17:30

I don't know a name for the specific variant you're describing - maybe others do. However, this indeed seems a good fit for a CP/MIP solver; I would go with the OR-Tools CP-SAT solver, which is free, flexible and usually works well.

Here's a reference implementation in Python, assuming each vehicle requires a team assigned to it with no overlaps, and that the goal is to minimize the number of teams in use. The framework allows you to directly model allowed / forbidden assignments (check out the docs).

from ortools.sat.python import cp_model
model = cp_model.CpModel()

## Data
num_vehicles = 20
max_teams = 10

# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]


### variables

# t, v is true iff vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}

#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v], team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v)) 
                     for t in range(max_teams) for v in range(num_vehicles)}

team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]

## constraints
# non overlap for each team
for t in range(max_teams):
    model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
    
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
    model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)

# what teams are in use?
for t in range(max_teams):
    model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])

#symmetry breaking - use teams in-order
for t in range(max_teams-1):
    model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())


# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))

solver = cp_model.CpSolver()

# optional
# solver.parameters.log_search_progress = True     
# solver.parameters.num_search_workers = 8
# solver.parameters.max_time_in_seconds = 5

result_status = solver.Solve(model)


if (result_status == cp_model.INFEASIBLE): 
    print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
    print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):                        
    print('Feasible (non optimal) result found')
else:
    print('No feasible solution found under constraints within time')  

# Output:
#
# Optimal result found, required teams=7        
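
As a sanity check on the result above: when conflicts come only from overlapping time windows (no compatibility or other side constraints), the conflict graph is an interval graph, and the minimum number of teams equals the largest number of vehicles present at the same time. A small stdlib-only sketch that computes this peak for the same generated data; the helper name max_overlap_depth is made up for this illustration, and zero-length windows are effectively ignored by it, which is an assumption of the sketch.

```python
def max_overlap_depth(starts, ends):
    """Largest number of half-open windows [start, end) open at the same time."""
    events = []
    for s, e in zip(starts, ends):
        events.append((s, 1))  # 1 = window opens
        events.append((e, 0))  # 0 = window closes; sorts before an open at the same time
    events.sort()

    depth = best = 0
    for _, kind in events:
        depth += 1 if kind == 1 else -1
        best = max(best, depth)
    return best


# Same data generation as in the model above
num_vehicles = 20
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [(num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [interval_starts[i] + interval_len[i] for i in range(num_vehicles)]

print(max_overlap_depth(interval_starts, interval_ends))  # 7
```

The peak of 7 matches the optimum reported by the solver. With additional constraints (e.g. forbidden team/vehicle pairs) this number is only a lower bound.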

EDIT:

@sascha suggested a beautiful approach for analyzing the (known in advance) time window overlaps, which would make this solvable as an assignment problem.

So while the formulation above might not be the optimal one for this (although it could be, depending on how the solver works), I've tried to replace the no-overlap conditions with the max-clique approach suggested - full code below.

I did some experiments with moderately large problems (100 and 300 vehicles). Empirically, on the smaller problems (~100) the max-clique formulation does help somewhat, about 15% on average on the time to the optimal solution, but I could not find a significant improvement on the larger (~300) problems. This might be because my formulation is not optimal; because the CP-SAT solver (which is also a good IP solver) is smart enough; or because there's something I've missed :)

Code:

(this is basically the same code as above, plus the logic, copied from @sascha's answer, for using the conflict-graph approach instead of the no-overlap one):

import itertools
import networkx as nx
from timeit import default_timer as timer
from ortools.sat.python import cp_model
model = cp_model.CpModel()

run_start_time = timer()

## Data
num_vehicles = 300
max_teams = 300

USE_MAX_CLIQUES = True

# Generate some (semi-)interesting data
interval_starts = [i % 9 for i in range(num_vehicles)]
interval_len = [ (num_vehicles - i) % 6 for i in range(num_vehicles)]
interval_ends = [ interval_starts[i] + interval_len[i] for i in range(num_vehicles)]

if (USE_MAX_CLIQUES):
    ## Max-cliques analysis
    # for the max-clique approach
    time_windows = [(interval_starts[i], interval_ends[i]) for i in range(num_vehicles)]

    def is_overlapping(a, b):
      return (b[1] > a[0] and b[0] < a[1])

    # raw conflicts
    # -------------
    binary_conflicts = [] 
    for a, b in itertools.combinations(range(len(time_windows)), 2):
      if is_overlapping(time_windows[a], time_windows[b]):
        binary_conflicts.append( (a, b) )

    # conflict graph
    # --------------
    G = nx.Graph()
    G.add_edges_from(binary_conflicts)

    # maximal cliques
    # ---------------
    max_cliques = nx.chordal_graph_cliques(G)

##

### variables

# t, v is true iff vehicle v is served by team t
team_assignments = {(t, v): model.NewBoolVar("team_assignments_%i_%i" % (t, v)) for t in range(max_teams) for v in range(num_vehicles)}

#intervals for vehicles. Each interval can be active or non active, according to team_assignments
vehicle_intervals = {(t, v): model.NewOptionalIntervalVar(interval_starts[v], interval_len[v], interval_ends[v], team_assignments[t, v], 'vehicle_intervals_%i_%i' % (t, v)) 
                     for t in range(max_teams) for v in range(num_vehicles)}

team_in_use = [model.NewBoolVar('team_in_use_%i' % (t)) for t in range(max_teams)]

## constraints
# non overlap for each team
if (USE_MAX_CLIQUES):
    overlap_constraints = [list(l) for l in max_cliques]
    for t in range(max_teams):
        for l in overlap_constraints:
            model.Add(sum(team_assignments[t, v] for v in l) <= 1)
else:        
    for t in range(max_teams):
        model.AddNoOverlap([vehicle_intervals[t, v] for v in range(num_vehicles)])
        

    
# each vehicle must be served by exactly one team
for v in range(num_vehicles):
    model.Add(sum(team_assignments[t, v] for t in range(max_teams)) == 1)

# what teams are in use?
for t in range(max_teams):
    model.AddMaxEquality(team_in_use[t], [team_assignments[t, v] for v in range(num_vehicles)])

#symmetry breaking - use teams in-order
for t in range(max_teams-1):
    model.AddImplication(team_in_use[t].Not(), team_in_use[t+1].Not())


# let's say that the goal is to minimize the number of teams required
model.Minimize(sum(team_in_use))

solver = cp_model.CpSolver()

# optional
solver.parameters.log_search_progress = True     
solver.parameters.num_search_workers = 8
solver.parameters.max_time_in_seconds = 120

result_status = solver.Solve(model)


if (result_status == cp_model.INFEASIBLE): 
    print('No feasible solution under constraints')
elif (result_status == cp_model.OPTIMAL):
    print('Optimal result found, required teams=%i' % (solver.ObjectiveValue()))
elif (result_status == cp_model.FEASIBLE):                        
    print('Feasible (non optimal) result found, required teams=%i' % (solver.ObjectiveValue()))
else:
    print('No feasible solution found under constraints within time')  
    
print('run time: %.2f sec ' % (timer() - run_start_time))
etov
  • Thank you very much for your answer. Very helpful code. I will take some time to digest it. – rad Jan 06 '21 at 17:21