I am looking for a framework to be used in a C++ distributed number crunching application.
The setup looks as follows:
There is a master node which divides the problem domain into small independent tasks. The tasks are distributed to worker nodes of differing capability (e.g. CPU type, GPU-enabled or not). Worker nodes are added to the compute grid dynamically as they become available. A worker node may also die without saying goodbye.
I am searching for a fast C/C++ framework to implement this setup.
To summarize, my main requirements are:
- Worker/Task-scheduling paradigm
- Dynamically add/remove nodes
- Target network: 1-10 Gbit/s Ethernet (corporate network; good performance over the Internet is not required)
- Optional: Encrypted and authenticated communication
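
To make the setup concrete, here is a minimal single-process sketch of the scheduling behaviour I have in mind. It is only an illustration: `Master`, `heartbeat`, `request_task`, `complete` and `reap` are hypothetical names, not the API of any existing framework, and the network transport is left out entirely. It also assumes that a dead worker is detected by a missed-heartbeat timeout, which is just one possible approach.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

using Clock = std::chrono::steady_clock;
using TaskId = std::uint64_t;

// Hypothetical bookkeeping record for one worker node.
struct Worker {
    Clock::time_point last_seen;    // time of the last heartbeat
    std::vector<TaskId> in_flight;  // tasks handed out but not yet completed
};

// Hypothetical master: keeps a queue of pending tasks and a worker registry.
class Master {
public:
    explicit Master(std::chrono::milliseconds timeout) : timeout_(timeout) {}

    void add_task(TaskId id) { pending_.push_back(id); }

    // A heartbeat from an unknown node registers it, so nodes can join at any time.
    void heartbeat(const std::string& node, Clock::time_point now) {
        workers_[node].last_seen = now;
    }

    // Hand the next pending task to a registered worker, if one is available.
    std::optional<TaskId> request_task(const std::string& node) {
        auto it = workers_.find(node);
        if (it == workers_.end() || pending_.empty()) return std::nullopt;
        TaskId id = pending_.front();
        pending_.pop_front();
        it->second.in_flight.push_back(id);
        return id;
    }

    // Mark a task as finished; results from unknown (already removed) nodes are ignored.
    void complete(const std::string& node, TaskId id) {
        auto it = workers_.find(node);
        if (it == workers_.end()) return;
        auto& tasks = it->second.in_flight;
        tasks.erase(std::remove(tasks.begin(), tasks.end(), id), tasks.end());
    }

    // Remove workers whose heartbeat is overdue and requeue their unfinished tasks.
    // Returns the number of workers removed.
    std::size_t reap(Clock::time_point now) {
        std::size_t removed = 0;
        for (auto it = workers_.begin(); it != workers_.end();) {
            if (now - it->second.last_seen > timeout_) {
                for (TaskId id : it->second.in_flight) pending_.push_back(id);
                it = workers_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    std::size_t pending() const { return pending_.size(); }
    std::size_t workers() const { return workers_.size(); }

private:
    std::chrono::milliseconds timeout_;
    std::deque<TaskId> pending_;
    std::unordered_map<std::string, Worker> workers_;
};
```

In this sketch the master would call `reap` periodically; a task held by a worker that silently died goes back into the queue and is handed to the next worker that calls `request_task`. A real framework would additionally need the transport, capability-aware task assignment (CPU vs. GPU) and, optionally, encryption and authentication.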