10

I'd like to write a Python script (call it the parent) that does the following:

(1) defines a multi-dimensional numpy array

(2) forks 10 different Python scripts (call them children). Each of them must be able to read the contents of the numpy array from (1) at any point in time (for as long as they are alive).

(3) each of the child scripts will do its own work (children DO NOT share any info with each other)

(4) at any point in time, the parent script must be able to accept messages from all of its children. These messages will be parsed by the parent and cause the numpy array from (1) to change.


How do I go about this when working in Python in a Linux environment? I thought of using ZeroMQ and having the parent be a single subscriber while the children are all publishers; does that make sense, or is there a better way to do this?

Also, how do I allow all the children to continuously read the contents of the numpy array defined by the parent?

user3262424
  • Have you considered using the PUSH/PULL model, as described in http://zguide.zeromq.org/page:all#Divide-and-Conquer? Only the ventilator and sink are the same process... – Fabien Bouleau Jun 24 '19 at 14:56

3 Answers

19

The PUB socket doesn't have to be the one to bind, so you can have the subscriber bind, and each of the children's PUB sockets can connect to that and send their messages. In this particular case, I think the multiprocessing module is a better fit, but I thought it bore mentioning:

import time
import threading

import zmq

# So that you can copy-and-paste this into an interactive session, I'm
# using threading, but obviously that's not what you'd use

# I'm the subscriber that multiple clients are writing to
def parent():
    context = zmq.Context()
    socket = context.socket(zmq.SUB)
    socket.setsockopt_string(zmq.SUBSCRIBE, 'Child:')
    # Even though I'm the subscriber, I'm allowed to get this party
    # started with `bind`
    socket.bind('tcp://127.0.0.1:5000')

    # I expect 50 messages (10 children x 5 messages each)
    for i in range(50):
        print('Parent received: %s' % socket.recv_string())

# I'm a child publisher
def child(number):
    context = zmq.Context()
    socket = context.socket(zmq.PUB)
    # And even though I'm the publisher, I can do the connecting rather
    # than the binding
    socket.connect('tcp://127.0.0.1:5000')

    # Give the subscription a moment to propagate before sending, or the
    # first messages may be dropped and the parent will hang waiting
    time.sleep(0.1)

    for data in range(5):
        socket.send_string('Child: %i %i' % (number, data))
    socket.close()

threads = [threading.Thread(target=parent)] + \
          [threading.Thread(target=child, args=(i,)) for i in range(10)]
for thread in threads:
    thread.start()

for thread in threads:
    thread.join()

In particular, the Core Messaging Patterns part of the documentation points out that for each of these patterns, either side can bind (and the other connect).

Dan Lecocq
  • I think you mean to say... "The PUB channel doesn't have to be the one to bind" – bremen_matt Feb 08 '17 at 14:33
  • I copied this into a py script and ran it. It just hangs on the recv command and never prints anything. – qwerty9967 Mar 16 '17 at 15:39
  • It's probably because you didn't wait for the connect to finish. I've added time.sleep(0.1) after the connect and it worked. – wolfoorin Jul 23 '17 at 10:47
  • Working for me on a subnet between nodes – Josh Usre Oct 16 '17 at 19:41
  • Note that you probably won't experience a great performance boost, as the GIL reduces the chances that your threads really run in parallel. As mentioned above you can avoid that by using the `multiprocessing` module. See e.g. https://emptysqua.re/blog/grok-the-gil-fast-thread-safe-python/ for a good explanation of the GIL. – Marti Nito Sep 24 '18 at 13:48
4

I think it makes more sense to use PUSH/PULL sockets, as you have a standard Ventilator - Workers - Sink scenario, except that the Ventilator and the Sink are the same process.

Also, consider using the multiprocessing module instead of ZeroMQ. It will probably be a bit easier.
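
To make that concrete, here is a minimal sketch of what the multiprocessing route could look like (assuming Python 3.8+ for multiprocessing.shared_memory; the three-field message format is purely illustrative). The children attach to the parent's shared-memory block and view it as a numpy array without copying it, and a Queue stands in for the PUSH/PULL channel back to the parent, which acts as both ventilator and sink:

import numpy as np
from multiprocessing import Process, Queue
from multiprocessing import shared_memory

def child(number, shm_name, shape, dtype, queue):
    # Attach to the parent's shared-memory block and view it as a numpy
    # array: a zero-copy view, so even a huge array is not duplicated
    shm = shared_memory.SharedMemory(name=shm_name)
    array = np.ndarray(shape, dtype=dtype, buffer=shm.buf)

    # ... do this child's own work here, reading `array` whenever needed ...
    # Then ask the parent to change one cell (illustrative message format)
    queue.put((number, 0, float(array[number, 0]) + 1.0))

    del array          # drop the view before closing the mapping
    shm.close()

if __name__ == '__main__':
    data = np.zeros((1000, 5))                        # the parent's array
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data                                  # copy initial contents in

    queue = Queue()                                   # children -> parent
    children = [Process(target=child,
                        args=(i, shm.name, shared.shape, shared.dtype, queue))
                for i in range(10)]
    for p in children:
        p.start()

    # The parent is both ventilator and sink: it parses each message and
    # is the only process that writes to the shared array
    for _ in range(10):
        row, col, value = queue.get()
        shared[row, col] = value

    for p in children:
        p.join()
    del shared
    shm.close()
    shm.unlink()

Because the children only map the existing buffer, even the 20,000,000 x 5 array mentioned in the comments below is never duplicated; the parent is the only writer, so no locking is shown here.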

tsg
  • tsg, thank you for your reply. `multiprocessing` is good -- but can I use it to share a `numpy` array? my shared object (ideally, `numpy` array) will have a matrix of 20,000,000 x 5 items. A regular `list` will be a huge memory waste. – user3262424 Jul 14 '11 at 22:06
  • also, how do I share the **parent** `numpy` array for READING among all **children**? – user3262424 Jul 14 '11 at 22:10
  • The first link is dead – Daniel Giger Jan 24 '22 at 21:57
-2

In ZeroMQ there can only be one publisher per port. The only (ugly) workaround is to start each child PUB socket on a different port and have the parent listen on all those ports.

But the pipeline pattern described in the 0MQ user guide is a much better way to do this.
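
For illustration, here is a minimal sketch of that pipeline pattern (using threads so it fits in one file; the port number is arbitrary). Every child connects a PUSH socket to the single PULL socket bound by the parent, so one port serves all of the children:

import threading
import zmq

def parent():
    context = zmq.Context.instance()
    sink = context.socket(zmq.PULL)
    # One bound PULL socket collects messages from every child
    sink.bind('tcp://127.0.0.1:5001')
    for _ in range(50):
        print('Parent received: %s' % sink.recv_string())

def child(number):
    context = zmq.Context.instance()
    push = context.socket(zmq.PUSH)
    # Each child connects to the same port; no per-child port is needed
    push.connect('tcp://127.0.0.1:5001')
    for data in range(5):
        push.send_string('Child: %i %i' % (number, data))
    push.close()

threads = [threading.Thread(target=parent)] + \
          [threading.Thread(target=child, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Unlike PUB/SUB, a PUSH socket queues (or blocks) until its connection is up, so messages sent immediately after connect are not silently dropped.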

fccoelho
  • This is not quite accurate. You can only *bind* one socket to a port, but in this scenario, there is no reason not to bind with the SUB, and connect all the PUBs to that one port. ZeroMQ doesn't care at all about bind/connect directions. – minrk Nov 11 '11 at 17:41