
Hello,
I'm playing with the num_search_workers parameter and I've noticed a strange behaviour with OR-Tools 7.5 on Windows.
I ran the following tests on a 32-core machine and found that 1 thread gives the best performance.
Do you know why?

start to solve using 1 threads ... solved in 13.578 secs

start to solve using 2 threads ... solved in 45.832 secs

start to solve using 4 threads ... solved in 53.031 secs

start to solve using 8 threads ... solved in 62.013 secs

start to solve using 16 threads ... solved in 157.5 secs

start to solve using 32 threads ... solved in 807.778 secs

start to solve using 64 threads ... solved in 386.252 secs

The model is more or less the following. self.suggested_decisions is a dictionary of BoolVars (the decision variables), and the constraints look like this:

model.Add(sum(self.scenario.constants['scaling_factor'] * self.suggested_decisions[r][0] for r in self.all_records)
          >= sum(sum(self.suggested_decisions[r][d] * int(0.60 * self.scenario.constants['scaling_factor'])
                     for r in self.all_records) for d in self.all_decisions))
model.Add(sum(int(self.scenario.dataset['AMOUNT_FINANCED'][r]) * self.suggested_decisions[r][0]
              for r in self.all_records) >= 2375361256)
model.Add(sum(self.scenario.constants['scaling_factor'] * self.scenario.dataset['Bad'][r] * self.suggested_decisions[r][0]
              for r in self.all_records)
          <= sum(self.suggested_decisions[r][0] * int(self.scenario.constants['scaling_factor'] * 0.038)
                 for r in self.all_records))
model.Maximize(sum(int(self.scenario.dataset['AMOUNT_FINANCED'][r]) * self.suggested_decisions[r][0]
                   for r in self.all_records))
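
For context, the timings above come from a loop roughly like this (a minimal sketch, not the actual code; build_model() is a hypothetical helper standing in for the model construction shown above):

import time
from ortools.sat.python import cp_model

def benchmark(build_model, worker_counts=(1, 2, 4, 8, 16, 32, 64)):
    # Rebuild and solve the model once per worker count, timing the Solve() call.
    for n in worker_counts:
        model = build_model()  # hypothetical helper returning a cp_model.CpModel
        solver = cp_model.CpSolver()
        solver.parameters.num_search_workers = n
        start = time.time()
        status = solver.Solve(model)
        elapsed = time.time() - start
        print('start to solve using %d threads ... solved in %.3f secs (%s)'
              % (n, elapsed, solver.StatusName(status)))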

1 Answer

Welcome to the world of parallelism.

From 1 to 8 threads, you are just unlucky. Communication between workers changes the search and slows it down.

Above 8 threads, you are most likely memory bound.

This being said, this is very rare.

Could you send me the model?

Laurent Perron
  • added a portion of the code to my question. It is basically a binary problem with linear constraints. Is there a criterion for choosing when to use one or more threads? Thanks! – stefano guerrieri Mar 12 '20 at 18:01
  • @stefanoguerrieri to export a model do: `with open("model.proto", "w") as f: f.write(str(model.Proto()))` and you can send it to him via email (not sure) or attach it to an issue https://github.com/google/or-tools/issues/ – Stradivari Mar 12 '20 at 18:23
  • thanks! I created an issue: https://github.com/google/or-tools/issues/1921 – stefano guerrieri Mar 12 '20 at 21:43
  • Got it, very interesting model. Multiple overlapping facts. The default search in 1 thread is incredibly lucky. The model is a large linear system. The first threads you add tend to be slow on the linear part (branching, MIP cuts...). Above 8 threads, I do not experience your slowdown (make sure you report wall time and not CPU time), so my system may be less memory bound. – Laurent Perron Mar 13 '20 at 08:01
  • yes, the time is measured as end - start around the call to the solve method – stefano guerrieri Mar 13 '20 at 13:33
  • my system has 128GB of RAM – stefano guerrieri Mar 13 '20 at 13:45
  • I meant memory bandwidth bound. – Laurent Perron Mar 13 '20 at 14:05
  • clear. Could it be that the reason one thread is so lucky is that I'm hinting the solution for thousands of variables, while with multiple threads the hints are somehow ignored? (See the hinting sketch after these comments.) – stefano guerrieri Mar 17 '20 at 08:01
  • Could be. I do not remember what we do with hinting in parallel. – Laurent Perron Mar 17 '20 at 08:31
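
Regarding the hinting discussion above, here is a minimal self-contained sketch of how solution hints are passed to CP-SAT (toy variables, not the questioner's actual model):

from ortools.sat.python import cp_model

model = cp_model.CpModel()
x = [model.NewBoolVar('x_%d' % i) for i in range(5)]
model.Add(sum(x) >= 2)

# Suggest a starting assignment; the solver may use it to guide the search.
for var in x:
    model.AddHint(var, 1)

solver = cp_model.CpSolver()
solver.parameters.num_search_workers = 1  # compare against larger values to see the effect
status = solver.Solve(model)
print(solver.StatusName(status))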