I need to improve the search speed of my negamax algorithm, and I see that Stockfish uses multithreading to do this.

However, when I tried spawning a thread for every child of the current node, the search got slower, because constantly creating and destroying threads is expensive.

I already have alpha-beta pruning, a transposition table, move ordering, etc.

How do I further improve negamax performance with threads?

Thanks

code_vader
  • Consider using a thread pool. The threads are preallocated, so there's only a bit of overhead queueing jobs to the pool which then runs them in a thread. – user4581301 Jul 19 '21 at 17:22
  • First off, there is no guarantee that the OS will run threads on separate cores. The OS can treat the threads as separate tasks and use round-robin scheduling on them. I recommend researching your OS to find out how to marry a core to a thread (thread affinity). – Thomas Matthews Jul 19 '21 at 17:40
  • Your threads should have more execution content than the overhead to create, launch and maintain them. Otherwise, they'll slow down your executable more than not having threads. – Thomas Matthews Jul 19 '21 at 17:41
  • Yes, this seems to be the problem, but I am not sure where to spawn the thread in the search for optimal performance – code_vader Jul 19 '21 at 17:43
  • Let's say you do get separate cores for your threads. The cores and threads must share memory. In most systems, they share the same *data bus* and *address bus*. This means that other activities must wait while a thread accesses memory. This may negatively affect your execution performance (more threads means more waiting for other tasks). – Thomas Matthews Jul 19 '21 at 17:44
  • Maybe use a thread pool (one thread per core) and reuse the threads after every job to speed things up? – code_vader Jul 19 '21 at 17:51
  • Stockfish is open source. Feel free to take a look at the code to see how multithreading is implemented. – Olivier Jul 19 '21 at 18:16
  • Where Thomas is headed is "Threading is far more complicated than it looks." You can't usually graft multithreading onto code after the fact. If you don't design thread awareness in from the beginning, you'll often find yourself writing and rewriting long past the point where an alternate-reality you threw out the original and rewrote it from scratch with threading in mind. – user4581301 Jul 19 '21 at 18:19
  • Consider using *OpenMP*. OpenMP runtimes already implement a thread pool optimization. It also supports tasks with dependencies and private data. Compared to threading libraries, introducing parallelism into sequential code is often relatively simple and not very intrusive (assuming you know how to parallelize the algorithm). OpenMP runtimes are quite well optimized, and OpenMP is frequently used in high-performance codes. It is supported by GCC, Clang and ICC. MSVC has only minimal (very old) support. – Jérôme Richard Jul 19 '21 at 22:24
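
To make the thread pool suggestion from the comments concrete, here is a minimal C++ sketch of one possible approach: start a fixed number of worker threads once per search and let them pull root moves from a shared counter, instead of spawning a thread per child node. Everything in it is an assumption for illustration: the toy game (take 1 to 3 stones, whoever takes the last stone wins) stands in for a real engine, and `negamax`, `search_root` and `RootResult` are hypothetical names. It is not how Stockfish parallelizes its search.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Toy game (hypothetical stand-in for a real engine): a pile of stones,
// each move removes 1, 2 or 3, and whoever takes the last stone wins.
// Scores are from the side to move: +1 = win, -1 = loss.
int negamax(int stones, int alpha, int beta) {
    if (stones == 0) return -1;  // the opponent just took the last stone
    int best = -2;               // lower than any reachable score
    for (int take = 1; take <= 3 && take <= stones; ++take) {
        int score = -negamax(stones - take, -beta, -alpha);
        best = std::max(best, score);
        alpha = std::max(alpha, score);
        if (alpha >= beta) break;  // beta cutoff
    }
    return best;
}

struct RootResult {
    int move;   // number of stones to take
    int score;  // +1 or -1 from the root player's point of view
};

// Root splitting: a fixed set of workers shares the root moves.
// Each subtree is still searched sequentially by one thread.
RootResult search_root(int stones, unsigned num_threads) {
    assert(stones > 0);
    num_threads = std::max(1u, num_threads);

    std::vector<int> moves;
    for (int take = 1; take <= 3 && take <= stones; ++take) moves.push_back(take);

    std::atomic<std::size_t> next{0};  // index of the next unsearched root move
    std::mutex best_mutex;             // protects `best`
    RootResult best{0, -2};

    auto worker = [&] {
        for (;;) {
            std::size_t i = next.fetch_add(1);
            if (i >= moves.size()) return;  // no root moves left
            int alpha;
            {
                // Reuse the best score found so far as the lower bound.
                std::lock_guard<std::mutex> lock(best_mutex);
                alpha = best.score;
            }
            int score = -negamax(stones - moves[i], -2, -alpha);
            std::lock_guard<std::mutex> lock(best_mutex);
            if (score > best.score) best = {moves[i], score};
        }
    };

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) workers.emplace_back(worker);
    for (auto& w : workers) w.join();
    return best;
}
```

A call such as `search_root(21, std::thread::hardware_concurrency())` would search the root moves on one worker per reported hardware thread (the function falls back to a single worker if that value is 0). Splitting only at the root keeps thread creation out of the recursive search, at the cost of weaker pruning: workers that start before a good root score is known search with a wider window than a sequential search would. A real engine would also need a thread-safe transposition table, which this sketch leaves out.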
