As the counts of connections / messages / sizes grow larger and larger, some default guesstimates typically cease to suffice. Try to extend some otherwise working defaults on the PUB-side configuration, where the problem seems to start choking ( do not forget that since v3.x the subscription-list processing got transferred from the SUB-side(s) to the central PUB-side. That reduces the volumes of data-flow, yet at some additional costs on the PUB-side, here growing to remarkable amounts : RAM-for-buffers + CPU-for-TOPIC-list-filtering... ).

So, let's start with these steps on the PUB-side :
aSock2SUBs = zmq.Context( _tweak_nIOthreads ).socket( zmq.PUB ) # MORE CPU POWER
aSock2SUBs.setsockopt( zmq.SNDBUF, _tweak_SIZE_with_SO_SNDBUF ) # ROOM IN SNDBUF
And last but not least, PUB-s do silently drop any messages that do not "fit" under their current HighWaterMark level, so let's tweak this one too :
aSock2SUBs.setsockopt( zmq.SNDHWM, _tweak_HWM_till_no_DROPs ) # TILL NO DROPS
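Put together, a minimal, self-contained sketch of these PUB-side steps might read as below. All the concrete values ( the 4 I/O-threads, the 4 MB SNDBUF, the 100k SNDHWM, the tcp://*:5556 endpoint ) are purely illustrative assumptions, to be tuned against your actual message rates, sizes and O/S limits, not recommendations :

import zmq                                            #          pyzmq binding

aCtx       = zmq.Context( io_threads = 4 )            # ASSUMED: MORE CPU POWER
aSock2SUBs = aCtx.socket( zmq.PUB )
aSock2SUBs.setsockopt( zmq.SNDBUF, 4 * 1024 * 1024 )  # ASSUMED: 4 MB ROOM IN SNDBUF
aSock2SUBs.setsockopt( zmq.SNDHWM, 100000 )           # ASSUMED: RAISED TILL NO DROPS
aSock2SUBs.bind( "tcp://*:5556" )                     # ASSUMED: illustrative endpoint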
Other low-level parameter settings { TCP_* | TOS | RECONNECT_IVL* | BACKLOG | IMMEDIATE | HEARTBEAT_* | ... } may help further to make your herd of 12k+ SUB-s live in peace side by side with other ( both friendly & hostile ) traffic and to make your application more robust than if relying just on pre-cooked API-defaults.
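For illustration only, a few such settings may look as below, shown here on the PUB-side socket ( the RECONNECT_IVL* pair only matters on whichever side calls .connect(), typically the SUB-s ). Each value is a hedged assumption to be validated against your own network conditions, not a recommendation :

aSock2SUBs.setsockopt( zmq.TCP_KEEPALIVE,         1 ) # ASSUMED: keep long-lived, silent sessions alive
aSock2SUBs.setsockopt( zmq.RECONNECT_IVL,       100 ) # ASSUMED: [ms] before a first re-connect attempt
aSock2SUBs.setsockopt( zmq.RECONNECT_IVL_MAX, 10000 ) # ASSUMED: [ms] ceiling for exponential back-off
aSock2SUBs.setsockopt( zmq.BACKLOG,            2048 ) # ASSUMED: deeper O/S listen()-queue for 12k+ peers
aSock2SUBs.setsockopt( zmq.HEARTBEAT_IVL,      5000 ) # ASSUMED: [ms] probing otherwise silent peers
aSock2SUBs.setsockopt( zmq.HEARTBEAT_TIMEOUT, 15000 ) # ASSUMED: [ms] after which a peer is deemed dead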
Consult the ZeroMQ API documentation together with the O/S defaults, as many of these low-level ZeroMQ attributes also rely on the actual O/S configuration values.
You shall also be warned that spawning 12k+ threads in Python still leaves a purely [SERIAL] code-execution, as the central GIL-lock ownership is exclusive : it principally avoids any form of [CONCURRENT] co-execution and re-[SERIAL]-ises any amount of threads into a waiting queue, resulting in a plain sequence of chunks' execution. ( By default, Python 2 switches threads every 100 instructions; since Python 3.2+, by default, the GIL gets released after 5 milliseconds ( 5,000 [us] ), so that another thread can have a chance to try to acquire the GIL-lock. ) You can change these defaults, if the war of 12k+ threads over swapping the GIL-lock ownership actually results in "almost-blocking" any and all of the TCP/IP-instrumentation for message buffering, stacking, sending and re-transmitting until a reception is confirmed in time. One may test it up to a bleeding edge, yet choosing some safer ceiling might help, if the other parameters have been well adjusted for robustness.
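If so, a minimal sketch of inspecting & re-setting the Python 3 switch interval ( the 0.001 [s] value being a purely illustrative assumption, worth measuring before & after on your actual workload; Python 2 used sys.setcheckinterval() instead ) :

import sys

print( sys.getswitchinterval() )   #          Python 3 default ~ 0.005 [s] ~ 5 [ms]
sys.setswitchinterval( 0.001 )     # ASSUMED: let I/O-bound threads rotate the GIL-lock faster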
Last but not least, enjoy the Zen-of-Zero, the masterpiece of Martin SUSTRIK for distributed-computing, so well crafted into an ultimately scalable, almost zero-latency, very comfortable, widely ported signalling & messaging framework.