The Swarm first:
First of all, a smart solution for a massive herd of agents requires both flexibility ( in the swarm-framework design, for adding features ) and efficiency ( for both scalability and speed ), so as to achieve the fastest possible simulation run-times, in spite of the PTIME & PSPACE obstacles, with a possible risk of wandering into the EXPTIME zone in more complex inter-agent communication schemes.
Efficiency next:
At first, my guess was to rather use and customise a bit the lighter-weight, POSIX-based signalling/messaging framework nanomsg -- a younger sister of ZeroMQ from Martin SUSTRIK, the co-father of ZeroMQ -- where the Context()-less design plus additional features like the SURVEY and BUS messaging archetypes are particularly attractive for swarms with your own software-designed, problem-domain-specific messaging/signalling protocols.
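As a flavour of that Context()-less design, here is a minimal sketch of a single swarm-agent node on the BUS archetype ( a sketch only -- the inproc:// address and the payload format are illustrative assumptions, not a prescribed protocol ):

```c
#include <stdio.h>
#include <string.h>
#include <nanomsg/nn.h>
#include <nanomsg/bus.h>

int main(void)
{
    /* no Context() needed -- a socket is created directly */
    int s = nn_socket(AF_SP, NN_BUS);
    if (s < 0) {
        fprintf(stderr, "nn_socket: %s\n", nn_strerror(nn_errno()));
        return 1;
    }

    /* each agent binds one endpoint; peers nn_connect() to it */
    nn_bind(s, "inproc://swarm");

    /* broadcast a problem-domain-specific message to all BUS peers */
    const char *msg = "AGENT_STATE:42";
    nn_send(s, msg, strlen(msg) + 1, 0);

    /* collect whatever the other agents broadcast
       ( blocks until a peer has actually sent something ) */
    char *buf = NULL;
    int n = nn_recv(s, &buf, NN_MSG, 0);   /* NN_MSG: library-allocated, zero-copy */
    if (n >= 0)
        nn_freemsg(buf);

    nn_close(s);
    return 0;
}
```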

100k+ with file_descriptors
Well, you need courage. Doable, but sleeves up: it will require hands-on efforts in the kernel, tuning of system settings, and you will pay for having such a scale with increased overheads.
Andrew Hacking has explained both the PROS and CONS of "just" increasing the fd count ( not only on the kernel side of the system tuning and configuration ).
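Just as an illustration of the knobs involved ( standard Linux tuning; the concrete values are situational examples, not a recommendation ):

```
# per-session soft limit ( shell built-in )
ulimit -n 200000

# system-wide ceilings ( persist them in /etc/sysctl.conf )
sysctl -w fs.file-max=1000000
sysctl -w fs.nr_open=1000000

# per-user limits in /etc/security/limits.conf
#   <user>   soft   nofile   200000
#   <user>   hard   nofile   200000
```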
Other factors to consider are that while some software may use sysconf( _SC_OPEN_MAX ) to dynamically determine the number of files that may be opened by a process, a lot of software still uses the C library's default FD_SETSIZE, which is typically 1024 descriptors, and as such can never have more than that many files open, regardless of any administratively defined higher limit.
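A minimal sketch to see all three limits side by side on a given box ( portable POSIX calls, nothing nanomsg-specific ):

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <sys/select.h>

int main(void)
{
    /* the per-process ceiling as reported dynamically by sysconf() */
    long sc_max = sysconf(_SC_OPEN_MAX);

    /* the same limit via getrlimit(): soft ( current ) and hard ( max ) */
    struct rlimit rl;
    getrlimit(RLIMIT_NOFILE, &rl);

    printf("sysconf(_SC_OPEN_MAX)     = %ld\n", sc_max);
    printf("RLIMIT_NOFILE soft / hard = %llu / %llu\n",
           (unsigned long long) rl.rlim_cur,
           (unsigned long long) rl.rlim_max);

    /* FD_SETSIZE is a compile-time constant ( typically 1024 ) --
       select()-based code can never exceed it, whatever the limits above say */
    printf("FD_SETSIZE                = %d\n", FD_SETSIZE);
    return 0;
}
```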
Andrew has also directed your kind attention to this, which may serve as an ultimate report on how to set up a system for 100k-200k connections.
Do static scales above 100k per host make any real sense for swarm simulations?
While still "technically" doable, there are further limits -- even nanomsg will not be able to push more than about 1,000,000 [MSGs/s], which is fairly enough for most applications, as they cannot keep pace with this native speed of message dispatch anyway. Citations state some ~6 [us] for CPU-core to CPU-core transfer latencies, so if the user-designed swarm-processing application cannot make its sending loop complete in under some 3-4 [us], that performance ceiling is nowhere close to causing an issue.
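A back-of-envelope check ( taking the 100k static scale from above as an illustrative figure ):

$$
t_{\mathrm{budget}} = \frac{1}{10^{6}\ \mathrm{[MSGs/s]}} = 1\ \mathrm{[us/MSG]},
\qquad
r_{\mathrm{agent}}^{\max} \approx \frac{10^{6}\ \mathrm{[MSGs/s]}}{10^{5}\ \mathrm{[agents]}} = 10\ \mathrm{[MSGs/s\ per\ agent]}
$$

In other words, at the 100k static scale a single host's dispatch capacity caps each agent at roughly ten broadcasts per second, long before the ~6 [us] core-to-core latency becomes the limiting factor.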
How to scale above that?
A distributed multi-host processing is the first dimension along which to attack the static scale of the swarm. Next would come the need to introduce an RDMA-injection, so as to escape the performance bottleneck of any stack-processing in the implementation of the distributed messaging / signalling. Yes, this can move your Swarm system into the nanosecond-scale latencies zone, but at the cost of building an HPC / high-tech computing infrastructure ( which would be a great Project, if your Project sponsor can arrange the financing of such an undertaking -- and pls. do let me know if yes, I would be more than keen to join such a swarm-intelligence HPC-lab ). It is worth knowing about this implication before deciding on the architecture, as knowing the ultimate limits is the key to doing it well from the very beginning.
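For the first dimension, stretching the very same BUS across hosts is just a change of transport -- a sketch, assuming nanomsg over tcp:// ( all addresses are hypothetical placeholders; an RDMA fabric would replace this transport layer entirely, which nanomsg itself does not provide ):

```c
#include <nanomsg/nn.h>
#include <nanomsg/bus.h>

/* broker-less multi-host fabric: every node binds one local endpoint
   and nn_connect()-s to its peers over tcp:// */
int join_swarm_fabric(void)
{
    int s = nn_socket(AF_SP, NN_BUS);
    if (s < 0)
        return -1;

    nn_bind(s, "tcp://*:5555");               /* this node's endpoint        */
    nn_connect(s, "tcp://192.168.0.11:5555"); /* peer host A ( placeholder ) */
    nn_connect(s, "tcp://192.168.0.12:5555"); /* peer host B ( placeholder ) */
    return s;                                 /* ready for nn_send()/nn_recv() */
}
```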