How to speed up graph coloring problem in python PuLP

Question

I am trying to solve the classic graph coloring problem using python PuLP. We have n nodes, a collection of edges in the form edges = [(node1, node2), (node2, node4), ...], and we are trying to find the minimum number of node colors so that no connected nodes share a color.

My implementation works, but is slow. It is made of three constraints, plus the one optimization of initializing node0 to color 0 to somewhat limit the search space. The code is as follows:

    from pulp import LpProblem, LpMinimize, LpVariable, lpSum

    nodes = range(node_count)
    n_colors = 10
    # colors = range(node_count)
    colors = range(n_colors)

    prob = LpProblem("coloring", LpMinimize)
    # variable xnc shows if node n has color c
    xnc = LpVariable.dicts("x", (nodes, colors), cat='Binary')
    # array of colors to indicate which ones were used
    used_colors = LpVariable.dicts("used", colors, cat='Binary')

    # minimize how many colors are used, and minimize int value for those colors
    prob += lpSum([used_colors[c] * c for c in colors])
    # prob += lpSum([used_colors[c] for c in colors])

    # set the first node to color 0 to constrain starting point
    prob += xnc[0][0] == 1

    # Every node uses one color
    for n in nodes:
        prob += lpSum([xnc[n][c] for c in colors]) == 1

    # Any connected nodes have different colors
    for e in edges:
        e1, e2 = e[0], e[1]
        for c in colors:
            prob += xnc[e1][c] + xnc[e2][c] <= 1

    # mark color as used if node has that color
    for n in nodes:
        for c in colors:
            prob += xnc[n][c] <= used_colors[c]

    prob.solve()

I see that there are symmetries, and I know I could reduce them by requiring any newly used color to be at most max(colors_already_used) + 1, so that if node 0 has color 0, node 1 has either color 0 or color 1. But I am not sure how to encode this, because as far as I know max is not allowed in PuLP given the linear nature of the problem. I achieve a similar effect above by weighting each used color by its integer value in the objective, which speeds things up a bit, but I do not think it is as efficient or deterministic as the constraint I am looking for.
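One way the "at most max(colors_already_used) + 1" rule might be written linearly, offered as a sketch rather than a benchmarked fix: node n may take color c only if some lower-numbered node takes color c - 1, and color c may be used only if color c - 1 is used. The helper name `add_symmetry_breaking` is hypothetical. It assumes the `prob`, `xnc`, and `used_colors` objects from the model above and that `nodes` is given in a fixed order. It relies only on operators, so it needs no imports of its own.

```python
def add_symmetry_breaking(prob, xnc, used_colors, nodes, colors):
    """Add ordering constraints to an existing coloring model (hypothetical helper).

    Assumes xnc[n][c] and used_colors[c] are the binary variables from the
    model above, and that prob supports `prob += constraint` as LpProblem does.
    """
    nodes = list(nodes)
    colors = list(colors)

    # Colors are switched on in index order: color c needs color c - 1.
    for prev, cur in zip(colors, colors[1:]):
        prob += used_colors[cur] <= used_colors[prev]

    # Node i may take color c only if an earlier node takes color c - 1.
    # For the first node the sum is empty (0), which forces it onto color 0.
    # Built-in sum() is used to avoid imports; lpSum works the same way here.
    for i, n in enumerate(nodes):
        earlier = nodes[:i]
        for prev, cur in zip(colors, colors[1:]):
            prob += xnc[n][cur] <= sum(xnc[m][prev] for m in earlier)

    return prob
```

It would be called as `add_symmetry_breaking(prob, xnc, used_colors, nodes, colors)` before `prob.solve()`. Any proper coloring can be relabeled by order of first appearance to satisfy these constraints, so the optimum is unchanged, but the sums grow with the node count and may slow model generation on large graphs.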

Also limiting the number of colors seems to have a nice effect on the speed, but I am not sure if it is worth the preprocessing cost to try and find a heuristic before starting the optimization, since it is not clear how many colors will be needed in advance.
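For a rough idea of what such a preprocessing heuristic could look like, here is a minimal greedy coloring sketch in plain Python. The function name `greedy_color_count` is hypothetical, and it assumes nodes are numbered 0 to node_count - 1 as in the model above. The count it returns is only an upper bound on the minimum number of colors, not the optimum.

```python
def greedy_color_count(node_count, edges):
    """Greedy largest-degree-first coloring (hypothetical helper).

    Returns (number_of_colors_used, {node: color}). The count is an upper
    bound on the chromatic number and could be used as n_colors.
    """
    # Build adjacency sets from the edge list.
    adj = [set() for _ in range(node_count)]
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    # Color high-degree nodes first, each with the smallest free color.
    order = sorted(range(node_count), key=lambda n: len(adj[n]), reverse=True)
    color = {}
    for n in order:
        taken = {color[m] for m in adj[n] if m in color}
        c = 0
        while c in taken:
            c += 1
        color[n] = c

    return max(color.values(), default=-1) + 1, color
```

Setting `n_colors = greedy_color_count(node_count, edges)[0]` before building the model would shrink the number of binary variables. The heuristic touches each edge a small number of times, so its cost should be minor next to the MIP solve.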

What other constraints could I add, or what other ways are there to speed it up? I am mostly interested in better ways to formulate the problem, but I am also open to computational optimizations, i.e. parallelization, if they can be done in PuLP.

There are three possible areas where PuLP may be slow: (1) PuLP model generation (2) communication between PuLP and the solver and (3) solution time in the solver. For MIP models it is usually (3). You may want to try alternative solvers with PuLP or write out an MPS file and submit to a few solvers at [NEOS](https://neos-server.org/neos/index.html). — Erwin Kalvelagen, Nov 27 '20 at 00:55
Instead of doing the pairs, you could try finding the maximal cliques in your problem and creating stronger constraints on those cliques. https://en.wikipedia.org/wiki/Clique_(graph_theory) — pchtsp, Nov 27 '20 at 20:14
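A minimal sketch of the clique idea from the comment above, under stated assumptions: `greedy_clique` and `add_clique_constraints` are hypothetical helper names, nodes are numbered 0 to node_count - 1, and `xnc` and `used_colors` are the variables from the question's model. The greedy search finds one clique, not all maximal cliques, so the pairwise edge constraints would still be needed for edges outside it.

```python
def greedy_clique(node_count, edges):
    """Grow one clique greedily from the highest-degree node (hypothetical helper)."""
    if node_count == 0:
        return []
    adj = [set() for _ in range(node_count)]
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)

    start = max(range(node_count), key=lambda n: len(adj[n]))
    clique = [start]
    candidates = set(adj[start])  # nodes adjacent to every clique member
    while candidates:
        # Prefer the candidate with the most neighbors among the candidates.
        n = max(candidates, key=lambda v: (len(adj[v] & candidates), -v))
        clique.append(n)
        candidates &= adj[n]
    return clique


def add_clique_constraints(prob, xnc, used_colors, clique, colors):
    """At most one clique member per color, linked to the color's used flag."""
    for c in colors:
        prob += sum(xnc[n][c] for n in clique) <= used_colors[c]
    return prob
```

The size of the clique is also a lower bound on the number of colors needed, and since clique members must all differ, fixing them to colors 0, 1, 2, ... in order is another possible way to remove symmetry.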
