I'm working on a brute-force algorithm for solving a kind of puzzle. The puzzle is a rectangle and, for reasons irrelevant here, the number of possible solutions of a rectangle of size width*height is 2^(min(width, height)) instead of 2^(width*height). Both dimensions are in the range 1..50 (most often below 30, though). This way, the number of solutions is at worst 2^50 (about 1 000 000 000 000 000). I store a solution as an unsigned 64-bit number, a kind of "seed".
I have two working algorithms for brute-force solving. Assume N is min(width, height) and isCorrect(uint64_t) is a predicate that returns whether the solution with the given seed is correct or not.
The most naive algorithm is roughly this:
vector<uint64_t> solutions;
// Note: uint64_t{1} << N, because 1 << N overflows int for N >= 31
for (uint64_t i = 0; i < (uint64_t{1} << N); ++i)
{
    if (isCorrect(i))
        solutions.push_back(i);
}
It works perfectly (assuming the predicate is actually implemented :D) but does not benefit from multiple cores, so I'd like a multi-threaded approach. I've come across QtConcurrent, which provides concurrent filter and map functions that automatically create an optimal number of threads to share the work.
So I have a new algorithm that is roughly this:
vector<uint64_t> solutionsToTry;
solutionsToTry.reserve(uint64_t{1} << N);
for (uint64_t i = 0; i < (uint64_t{1} << N); ++i)
    solutionsToTry.push_back(i);
// Now, filtering
QFuture<uint64_t> solutions = QtConcurrent::filtered(solutionsToTry, &isCorrect);
It does work too, and is a bit faster, but when N gets close to 30 there's simply not enough room in my RAM to allocate the vector (with N = 30 and 64-bit numbers, I need about 8.6 GB of RAM just for the seeds. It's okay with swap partitions etc., but since the size gets multiplied by 2 every time N increases by 1, it can't go much further).
Is there a simple way to do concurrent filtering without bloating memory?
If there isn't, I might rather hand-split the loop across 4 threads to get concurrency without the automatic tuning, or rewrite the algorithm in Haskell to get lazy evaluation and filtering of infinite lists :-)
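For reference, the hand-split fallback I have in mind could be sketched roughly like this (only a sketch: solveParallel, the even range split, and the thread count are my own choices, and it assumes isCorrect is thread-safe and side-effect free):

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// Each thread scans its own contiguous sub-range of seeds and collects
// matches in a thread-local vector, so memory stays proportional to the
// number of solutions found, not to 2^N.
std::vector<uint64_t> solveParallel(unsigned N, unsigned numThreads,
                                    bool (*isCorrect)(uint64_t))
{
    const uint64_t total = uint64_t{1} << N;
    std::vector<std::vector<uint64_t>> partial(numThreads);
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            const uint64_t begin = total * t / numThreads;
            const uint64_t end   = total * (t + 1) / numThreads;
            for (uint64_t seed = begin; seed < end; ++seed)
                if (isCorrect(seed))
                    partial[t].push_back(seed); // thread-local, no locking
        });
    }
    for (auto& w : workers)
        w.join();

    // Merge the per-thread results; ranges are contiguous, so the
    // final list comes out in ascending seed order.
    std::vector<uint64_t> solutions;
    for (auto& p : partial)
        solutions.insert(solutions.end(), p.begin(), p.end());
    return solutions;
}
```

The split is static, so if isCorrect is much slower on some seeds than others the load won't be balanced, which is exactly the "without the automatic tuning" trade-off mentioned above.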