
I have a setup where a master process is setting up a ZMQ_ROUTER and then forks many child processes, which then connect to that router.

Whenever a child zmq_connect()'s to the master, one file descriptor is occupied.

This, however, limits the number of interacting processes to the number of allowed file descriptors ( per process ). For me ( Linux ), this is currently just 1024.

That is way too small for my intended use ( a multi-agent / swarm simulation ).
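
A minimal sketch of the setup described above ( the DEALER socket type for the children, the endpoint and the child count are only illustrative here, not my actual code ):

```c
/* Sketch: ROUTER in the master, forked children each zmq_connect() back.
 * Every accepted connection costs the master one file descriptor. */
#include <zmq.h>
#include <unistd.h>

#define N_CHILDREN 8            /* the real swarm would be in the thousands */

int main(void)
{
    void *ctx    = zmq_ctx_new();
    void *router = zmq_socket(ctx, ZMQ_ROUTER);
    zmq_bind(router, "tcp://127.0.0.1:5555");             /* one listening fd */

    for (int i = 0; i < N_CHILDREN; ++i) {
        if (fork() == 0) {                                 /* child process */
            void *cctx   = zmq_ctx_new();                  /* fresh context after fork */
            void *dealer = zmq_socket(cctx, ZMQ_DEALER);
            zmq_connect(dealer, "tcp://127.0.0.1:5555");   /* +1 fd in the master */
            zmq_send(dealer, "hello", 5, 0);
            zmq_close(dealer);
            zmq_ctx_term(cctx);
            _exit(0);
        }
    }

    char buf[256];
    for (int i = 0; i < N_CHILDREN; ++i) {
        zmq_recv(router, buf, sizeof buf, 0);              /* routing-id frame */
        zmq_recv(router, buf, sizeof buf, 0);              /* payload frame */
    }
    zmq_close(router);
    zmq_ctx_term(ctx);
    return 0;
}
```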


Answer:

You can't, except when using an inter-thread socket type ( using an inproc:// transport-class ). All other protocols use one file descriptor per connection.
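
A minimal illustration of that inproc:// exception ( the endpoint name and thread count are chosen only for the sketch ):

```c
/* Sketch: inproc:// peers share one context and add no per-connection
 * file descriptor, but all peers must live inside one process. */
#include <zmq.h>
#include <pthread.h>

#define N_WORKERS 4

static void *ctx;                                   /* shared by all inproc peers */

static void *worker(void *arg)
{
    (void)arg;
    void *dealer = zmq_socket(ctx, ZMQ_DEALER);
    zmq_connect(dealer, "inproc://master");         /* no fd consumed per connection */
    zmq_send(dealer, "hello", 5, 0);
    zmq_close(dealer);
    return NULL;
}

int main(void)
{
    ctx = zmq_ctx_new();
    void *router = zmq_socket(ctx, ZMQ_ROUTER);
    zmq_bind(router, "inproc://master");            /* bind before connect for inproc */

    pthread_t t[N_WORKERS];
    for (int i = 0; i < N_WORKERS; ++i) pthread_create(&t[i], NULL, worker, NULL);

    char buf[256];
    for (int i = 0; i < N_WORKERS; ++i) {
        zmq_recv(router, buf, sizeof buf, 0);       /* routing-id frame */
        zmq_recv(router, buf, sizeof buf, 0);       /* payload frame */
    }
    for (int i = 0; i < N_WORKERS; ++i) pthread_join(t[i], NULL);
    zmq_close(router);
    zmq_ctx_term(ctx);
    return 0;
}
```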

One newer approach to reducing the number of file descriptors an application needs, if that application offers several services ( e.g. several tcp://<address:port> endpoints that can be connected to ), seems to be the resource property, which allows several services to be combined onto one endpoint.

  • Sounds attractive to have message-passing intelligent swarm. Anyway, alex, the direction goes into double-trouble, with the pain coming from both **`PTIME` & `PSPACE`** directions in the `ZMQ_FD` sense and potentially the `EXPTIME` horror may come from the messaging flow -- **what is the order of magnitude of your target numbers?** `1E4`? `1E5`? `1E6`? Will love to hear about your Project / Research more, alex. – user3666197 Jul 20 '16 at 04:42
  • Yes, I'm aware of the fact that splitting a given task into a multi-agent thing may increase complexity. I heard arguments in both directions; some people say 'we need to split it up, otherwise we can't get a flexible/capable enough solution', others say: 'bring it together, otherwise you can't control complexity'. I guess both are true. Depends on the application. – alex Jul 20 '16 at 08:06
  • There was absolutely no objection on this per se, but need to know the projected orders of magnitude your swarms are going to grow into. **So?** – user3666197 Jul 20 '16 at 08:39
  • Ah, I missed this information. I would say 10k is a must, 100k is good, everything above would be great. – alex Jul 20 '16 at 16:34
  • Thanks for the clarification, **100k+ is nice and doable** and can keep growing, as a principally distributed system, by further scaling, depending on how complex the swarm inter-agent signalling / communication layer ought to be. – user3666197 Jul 21 '16 at 07:14

1 Answer


The Swarm first:

First of all, a smart solution for a massive herd of agents requires both flexibility ( in the swarm framework design, so features can be added later ) and efficiency ( for both scalability and speed ), so as to achieve the fastest possible simulation run-times in spite of the PTIME & PSPACE obstacles, with the risk of wandering into the EXPTIME zone in more complex inter-agent communication schemes.


Efficiency next:

At the first moment, my guess was to rather use and customise the more light-weight, POSIX-based signalling/messaging framework nanomsg -- a younger sister of ZeroMQ from Martin SUSTRIK, co-father of ZeroMQ -- where the Context()-less design plus additional features like the SURVEY and BUS messaging archetypes are particularly attractive for swarms with your own software-designed, problem-domain-specific messaging/signalling protocols.
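
A minimal sketch of the BUS archetype in the nanomsg C-API, just to show the shape of it ( the ipc endpoint name is illustrative only ):

```c
/* Sketch: two NN_BUS nodes; anything one node sends is delivered to
 * every other node on the bus, which maps nicely onto swarm signalling. */
#include <nanomsg/nn.h>
#include <nanomsg/bus.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int node0 = nn_socket(AF_SP, NN_BUS);
    int node1 = nn_socket(AF_SP, NN_BUS);

    nn_bind(node0,    "ipc:///tmp/swarm_bus.ipc");
    nn_connect(node1, "ipc:///tmp/swarm_bus.ipc");
    sleep(1);                                       /* let the connection settle */

    const char msg[] = "agent-1: ping";
    nn_send(node1, msg, strlen(msg) + 1, 0);        /* broadcast onto the bus */

    char buf[64];
    int n = nn_recv(node0, buf, sizeof buf, 0);
    if (n >= 0) printf("node0 received: %s\n", buf);

    nn_close(node1);
    nn_close(node0);
    return 0;
}
```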


100k+ with file_descriptors

Well, you need courage. It is doable, but roll your sleeves up: it will require hands-on effort in the kernel, tuning of system settings, and you will pay for having such a scale with increased overheads.

Andrew Hacking has explained both the PROS and CONS of "just" increasing the fd count ( not only on the kernel side of system tuning and configuration ).

Other factors to consider are that while some software may use sysconf(OPEN_MAX) to dynamically determine the number of files that may be open by a process, a lot of software still uses the C library's default FD_SETSIZE, which is typically 1024 descriptors and as such can never have more than that many files open regardless of any administratively defined higher limit.

Andrew has also directed your kind attention to this, which may serve as an ultimate report on how to set up a system for 100k-200k connections.
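
As a sketch of the process-side half of that tuning ( the hard limit and the system-wide ceiling, e.g. fs.file-max or /etc/security/limits.conf, still have to be raised administratively ), a process may lift its own soft fd limit up to the hard limit via setrlimit():

```c
/* Sketch: raise the soft RLIMIT_NOFILE up to the administratively set
 * hard limit at startup; select()/FD_SETSIZE-based code still breaks
 * above 1024, whereas poll()/epoll-based code ( as used by libzmq ) does not. */
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) { perror("getrlimit"); return 1; }

    printf("soft = %llu, hard = %llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    rl.rlim_cur = rl.rlim_max;                      /* ask for the hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) { perror("setrlimit"); return 1; }

    return 0;
}
```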


Do static scales above 100k per host make any real sense for swarm simulations?

While still "technically" doable, there are further limits -- even nanomsg will not be able to push more than about 1,000,000 [MSGs/s], which is more than enough for most applications, as they cannot keep pace with this native speed of message-dispatch. Citations state some ~6 [us] for CPU-core to CPU-core transfer latencies, and if the user-designed swarm-processing application cannot keep its sending loop under some 3-4 [us] per message ( whereas the dispatch itself costs only about 1 [us] per message at 1,000,000 [MSGs/s] ), the transport's performance ceiling is nowhere close to causing an issue.


How to scale above that?

Distributed multi-host processing is the first dimension along which to attack the static scale of the swarm. Next would be the need to introduce RDMA-injection, so as to escape the performance bottleneck of any stack-processing in the implementation of the distributed messaging / signalling. Yes, this can move your Swarm system into the nanosecond-scale latencies zone, but at the cost of building an HPC / high-tech computing infrastructure ( which would be a great Project, if your Project sponsor can adjust the financing of such an undertaking -- + pls. pls. do let me know if yes, I would be more than keen to join such a swarm-intelligence HPC-lab ). It is worth knowing about this implication before deciding on the architecture, as knowing the ultimate limits is the key to doing it well from the very beginning.
