
I am using PuLP with Python to do some optimization, and because my problem is quite large, I decided to try multithreading, i.e. choosing

However, in testing with a small subset of the main problem (10k instead of 1M people), I cannot get multithreading to actually use multiple threads.

I followed the instructions to build the solver from source using the ../configure --enable-cbc-parallel flag as described on the COIN-OR website; everything worked and all the tests passed. I checked the CBC config log in build/Cbc/config.log, and line 845 has the message configure:30105: Cbc multithreading enabled, so multithreading is definitely enabled.

System:

  • Mac OS X 10.14.3
  • i7-4870HQ quad-core
  • Python 3.6.7 w/Anaconda
  • problem occurs both in Jupyter and running in Python interpreter from the command line

Code, similar to the example here:

import time
from pulp import solvers

start = time.time()
solver = solvers.COIN_CMD('~/Cbc-2.9/build/Cbc/src/cbc', threads=8, msg=1, fracGap=0.01)
prob.solve(solver)
print('time to solve:', time.time() - start, 'seconds')

>> time to solve: 24.815305948257446 seconds

That time was about the same whether I specified the multi-threaded solver or just used the default solver.

The CBC output at runtime included the line:

threads was changed from 0 to 8

and also the lines:

Cbc0012I Integer solution of -25507 found by DiveCoefficient after 0 iterations and 0 nodes (18.04 seconds)
Cbc0030I Thread 0 used 0 times,  waiting to start 0.291008, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 1 used 0 times,  waiting to start 0.24997687, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 2 used 0 times,  waiting to start 0.21034408, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 3 used 0 times,  waiting to start 0.17122722, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 4 used 0 times,  waiting to start 0.13530493, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 5 used 0 times,  waiting to start 0.098966837, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 6 used 0 times,  waiting to start 0.062871933, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Thread 7 used 0 times,  waiting to start 0.028151035, 0 cpu time, 0 locks, 0 locked, 0 waiting for locks
Cbc0030I Main thread 0 waiting for threads,  1 locks, 0.00077700615 locked, 9.5367432e-07 waiting for locks
Cbc0001I Search completed - best objective -25507, took 0 iterations and 0 nodes (18.29 seconds)

which suggests that all the threads were created but never actually used?
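For reference, here is a quick way to check this programmatically from the solver log (the helper is just a sketch against the Cbc0030I lines shown above, not part of PuLP or CBC):

```python
import re

# Hypothetical helper: count how many CBC worker threads actually did work,
# based on the "Cbc0030I Thread N used X times" lines in the solver log.
def count_active_threads(log_text):
    used = re.findall(r"Cbc0030I Thread \d+ used (\d+) times", log_text)
    return sum(1 for u in used if int(u) > 0)

log = """\
Cbc0030I Thread 0 used 0 times,  waiting to start 0.291008, 0 cpu time
Cbc0030I Thread 1 used 0 times,  waiting to start 0.24997687, 0 cpu time
"""
print(count_active_threads(log))  # 0 -> threads were created but never used
```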

One possibility I've considered but don't know how to verify: maybe my path to the solver is wrong, i.e. COIN_CMD shouldn't be pointed at .../cbc but at something else. I haven't found anything on that.
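One related pitfall worth ruling out: a '~' in a path string handed to a subprocess is not expanded by the shell, so the solver binary may silently not be found. A small stdlib-only check (the path below is the one from my code; adjust to your own layout):

```python
import os
import shutil

# '~' is not expanded automatically when a path string is passed to a
# subprocess, so expand it first and check the binary exists and is executable.
path = os.path.expanduser('~/Cbc-2.9/build/Cbc/src/cbc')
print(os.path.isfile(path) and os.access(path, os.X_OK))

# shutil.which also finds a cbc binary on the PATH, if one is installed there
print(shutil.which('cbc'))
```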

So what am I doing wrong? I couldn't find any other documentation on how to use the threads. Hopefully this is a stupid question with an easy solution. Thanks for your help.

russellthehippo
  • Hey, is there a tutorial on how to compile with multicore enabled? The link provided doesn't work – mrbTT Jul 05 '22 at 01:01

1 Answer


It looks like all the work was done during preprocessing: your log shows the search completed after 0 iterations and 0 nodes. Parallel threads only kick in during the branch-and-bound phase, after preprocessing. Try a model or dataset where CBC has to do some real branching, i.e. where the number of nodes is significant. For most larger MIP models, CBC will need to explore a large number of nodes, and in that case parallel threads can make a difference. In some cases, though, they may also lead to worse performance (see link).
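To see why the node count matters, here is a toy depth-first branch-and-bound for a 0/1 knapsack that counts the nodes it explores (an illustration of the concept only, not CBC's actual algorithm). Parallel MIP solvers split work at the node level, so a solve that finishes in 0 nodes gives the extra threads nothing to do:

```python
# Toy 0/1 knapsack branch-and-bound, counting explored nodes.
def knapsack_bb(values, weights, capacity):
    best = 0
    nodes = 0

    def bound(i, value, room):
        # optimistic bound: take remaining items fractionally
        for j in range(i, len(values)):
            if weights[j] <= room:
                room -= weights[j]
                value += values[j]
            else:
                value += values[j] * room / weights[j]
                break
        return value

    def branch(i, value, room):
        nonlocal best, nodes
        nodes += 1
        if i == len(values):
            best = max(best, value)
            return
        if bound(i, value, room) <= best:
            return  # prune: this subtree cannot beat the incumbent
        if weights[i] <= room:
            branch(i + 1, value + values[i], room - weights[i])  # take item i
        branch(i + 1, value, room)                               # skip item i

    branch(0, 0, capacity)
    return best, nodes

best, nodes = knapsack_bb([60, 100, 120], [10, 20, 30], 50)
print(best, nodes)  # best value 220, after exploring more than one node
```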

Erwin Kalvelagen
    Okay, thanks! This test was optimizing for 10,000 targets; my real data is about 6,000,000. I tried it with ~750,000 and it ended up using more threads, but as in the link (very helpful!) I experienced worse performance. What I ended up doing was manually implementing multiprocessing to optimize smaller segments (10,000) at a time with Python's built-in pool.apply method, then combining the results. Optimality suffered somewhat, but only by ~.01%, and it sped things up massively. Unique to my problem, obviously, but it worked for me. – russellthehippo Feb 20 '19 at 10:59
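The chunk-and-combine approach described in that comment might look roughly like this. The solve function is a toy stand-in (in the real setup each call would build and solve a small PuLP model for ~10,000 people), and the sketch uses multiprocessing.dummy's thread-backed Pool, which has the same apply/apply_async API as a process pool; apply_async is used so the chunks actually run concurrently:

```python
from multiprocessing.dummy import Pool  # thread-backed, same API as multiprocessing.Pool

# Toy stand-in for solving one segment of the assignment problem.
def solve_segment(people):
    return {p: p % 2 for p in people}  # hypothetical "assignment" per person

def solve_in_chunks(people, chunk_size, workers=4):
    chunks = [people[i:i + chunk_size] for i in range(0, len(people), chunk_size)]
    with Pool(workers) as pool:
        partials = [pool.apply_async(solve_segment, (c,)) for c in chunks]
        combined = {}
        for r in partials:
            combined.update(r.get())  # merge each segment's solution
    return combined

result = solve_in_chunks(list(range(100)), chunk_size=25)
print(len(result))  # 100 people assigned
```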