I have a classic minimax solver with alpha-beta pruning.
I parallelized the algorithm in the following way:
- Do iterative deepening until we have more nodes than available threads
- Run one minimax search per thread, in batches of N threads. For example, if the serial search yields 9 possible moves at depth 2, we first start 4 threads, then another 4, and then 1 at the end, each starting at depth 2 with its own parameters.
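To make the batching concrete, here is a minimal sketch of the scheme described above. The names `search_subtree`, `run_in_batches` and `batch_sizes` are hypothetical, and `search_subtree` is only a placeholder for the real per-thread minimax call; the actual implementation may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def search_subtree(position):
    # Hypothetical stand-in for one minimax search started at depth 2.
    return position * position


def batch_sizes(count, n_threads):
    # Sizes of the consecutive batches, e.g. 9 moves on 4 threads -> 4, 4, 1.
    return [min(n_threads, count - start) for start in range(0, count, n_threads)]


def run_in_batches(positions, n_threads, worker):
    # Start at most n_threads workers, wait for the whole batch, then start the next.
    results = []
    for start in range(0, len(positions), n_threads):
        batch = positions[start:start + n_threads]
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            results.extend(pool.map(worker, batch))
    return results


if __name__ == "__main__":
    print(batch_sizes(9, 4))
    print(run_in_batches(list(range(9)), 4, search_subtree))
```

Note that with this layout the last batch keeps only one of the four threads busy, which by itself limits the achievable speedup.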
It turns out that the speedup S = T(serial)/T(parallel) for 4 threads is 4.77, so I am apparently breaking Amdahl's law here.
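For reference, a small sketch of the bound I am comparing against. Amdahl's law gives S(N) = 1 / ((1 - p) + p / N), where p is the parallelizable fraction of the work, so with N = 4 the speedup cannot exceed 4 even for p = 1. The function name `amdahl_speedup` is my own, and the formula assumes the parallel run does the same total work as the serial one, which may not hold exactly for a pruning search.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    # Amdahl's law: the serial part is not sped up, the parallel part is divided by n_threads.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


if __name__ == "__main__":
    for p in (0.5, 0.9, 1.0):
        print(p, round(amdahl_speedup(p, 4), 2))
```

Any value of p between 0 and 1 gives a speedup of at most 4 on 4 threads, which is why 4.77 looks suspicious.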
Assuming the implementation is not broken in some way, I suspect alpha-beta pruning is doing the magic here: because several searches start in parallel, is there more pruning, and sooner? That is my theory, but I'd love it if someone could confirm this in more detail.
Just to clarify:
Minimax without alpha-beta is basically a depth-first search of the whole tree up to some max depth. With alpha-beta it does the same, except that it prunes branches that would lead to a worse result anyway.
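As an illustration of the difference, here is a minimal sketch on a toy tree (nested lists, with numbers as leaf evaluations). This is not my actual solver; the `TREE` data and the leaf counter are made up purely to show how many leaves each variant visits.

```python
import math


def minimax(node, maximizing, counter):
    # Plain depth-first minimax: every leaf is evaluated.
    if not isinstance(node, list):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    # Same search, but stop exploring a node once alpha >= beta.
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the minimizing parent will never choose this branch
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return best


# Toy tree: a maximizing root with three minimizing children.
TREE = [[3, 5, 6], [2, 9, 8], [1, 4, 7]]

if __name__ == "__main__":
    plain, pruned = [0], [0]
    print(minimax(TREE, True, plain), plain[0])
    print(alphabeta(TREE, True, pruned), pruned[0])
```

Both variants return the same value for the root; the alpha-beta version simply evaluates fewer leaves, and how many it skips depends on the order in which the moves are searched.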
Edit: After further examination of the code, I found a bug on one line which caused the program to "cheat" and not follow some moves. The actual speedup factor is now 3.6. Sorry for wasting everyone's time... no breakthrough in computing today. :/