6

I'm trying to set up "inverted" PUB/SUB with ZeroMQ.

By that I mean the subscribing (SUB) sockets belong to several long-lived servers, which call zmq_bind(), while the publishing (PUB) socket belongs to a short-lived client, which calls zmq_connect().

I use a single ipc:// socket.

I expect a message from the publisher to reach each of the subscribers.

The problem: only one of the subscriber processes ever receives messages. If that process dies, the publisher gets stuck in zmq_term().

Is this mode of operation supported by ZeroMQ? If so, what am I doing wrong? If not, how can I implement what I need?

Minimal example with some additional details (in Lua, but this should not matter): https://gist.github.com/938429

Wooble
Alexander Gladysh

3 Answers

6

You can't have multiple sockets bound to one ipc:// address at the same time. The ipc:// transport uses Unix domain sockets, so ipc:///tmp/test.ipc corresponds to the file /tmp/test.ipc.

What you can do is bind each SUB socket to a different ipc:// address and have the publisher connect one PUB socket to each of those addresses. ZeroMQ allows one socket to bind/connect to multiple addresses.

The blocking in zmq_term() is most likely due to a lingering-close issue (i.e. there is a message that the PUB socket is still trying to send). Take a look at the ZMQ_LINGER socket option.
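
A minimal sketch of the two suggestions above, written in Python with pyzmq (an assumption; the question's gist is in Lua). The endpoint file names, the function name and the resend loop are hypothetical and only there to make the example self-contained. It binds one SUB socket per ipc:// address, connects a single PUB socket to all of them, and sets ZMQ_LINGER to 0 so that closing the publisher discards unsent messages instead of blocking.

```python
import os
import shutil
import tempfile
import time

import zmq  # pyzmq; assumed to be installed


def publish_to_all(n_subscribers=2, timeout_s=5.0):
    """Bind one SUB per endpoint, then connect a single PUB to all of them."""
    ctx = zmq.Context()
    tmpdir = tempfile.mkdtemp()
    # Hypothetical endpoint names: one socket file per subscriber.
    endpoints = [
        "ipc://" + os.path.join(tmpdir, "sub%d.ipc" % i)
        for i in range(n_subscribers)
    ]

    subs = []
    for endpoint in endpoints:
        sub = ctx.socket(zmq.SUB)
        sub.setsockopt(zmq.SUBSCRIBE, b"")  # no filtering, receive everything
        sub.bind(endpoint)
        subs.append(sub)

    pub = ctx.socket(zmq.PUB)
    # LINGER of 0 drops pending messages on close, so term() does not wait.
    pub.setsockopt(zmq.LINGER, 0)
    for endpoint in endpoints:
        pub.connect(endpoint)  # one socket, several connections

    # Connections and subscriptions are established asynchronously, so the
    # message is resent until every subscriber has seen it or time runs out.
    received = [None] * n_subscribers
    deadline = time.time() + timeout_s
    while None in received and time.time() < deadline:
        pub.send(b"hello")
        for i, sub in enumerate(subs):
            if received[i] is None and sub.poll(50):
                received[i] = sub.recv()

    pub.close()
    for sub in subs:
        sub.close(linger=0)
    ctx.term()
    shutil.rmtree(tmpdir, ignore_errors=True)
    return received
```

In a real setup each subscriber would live in its own process; they share one process here only to keep the sketch short.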

Uli Köhler
Neopallium
  • Yep, each bind just overrides the file. Should have figured that out myself, sorry for the noise. – Alexander Gladysh Apr 23 '11 at 08:06
  • You're not the first person to be hit by this; it's quite inconsistent with other transports and thus not obvious... there's been discussion on the list about changing this. – Pieter Hintjens Nov 22 '12 at 05:12
4

There's a 'feature' of the ipc:// transport which is that if two processes bind to the same IPC endpoint, the second one will silently steal the binding from the first one. This 'feature' was made to allow processes to recover easily after a crash.

This is why only one subscriber is getting the messages.
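
The effect can be reproduced without ZeroMQ at all. The sketch below uses plain Unix domain sockets from the Python standard library and assumes, as described above, that the second bind replaces the socket file rather than failing; the explicit os.unlink() call models that step and is not taken from ZeroMQ's source. The function name and file name are hypothetical.

```python
import os
import shutil
import socket
import tempfile


def steal_binding_demo():
    """Show that a client only reaches the listener that bound the path last."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "test.ipc")

    first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    first.bind(path)
    first.listen(1)

    # Assumption: this models the "stealing" bind. The old socket file is
    # removed and a new one is created at the same path.
    os.unlink(path)
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    second.bind(path)
    second.listen(1)

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)  # the path now refers to the second listener
    client.sendall(b"hello")

    # The first listener is still open, but nothing can reach it by path.
    first.setblocking(False)
    try:
        conn, _ = first.accept()
        conn.close()
        reached_first = True
    except BlockingIOError:
        reached_first = False

    conn, _ = second.accept()
    data = conn.recv(5)

    for s in (conn, client, first, second):
        s.close()
    shutil.rmtree(tmpdir, ignore_errors=True)
    return reached_first, data
```

The first listener never sees the connection, which matches the symptom in the question: the subscriber that bound last is the only one the publisher can find.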

Since you have just one publisher, why not bind the publisher, and connect the subscribers to that? Even if the publisher comes and goes, the subscribers will reconnect automatically.
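
A rough sketch of that arrangement in Python with pyzmq (an assumption; any binding should look similar). The endpoint name, function name and resend loop are hypothetical. The subscribers connect before the publisher has bound, to illustrate that they keep retrying until it shows up.

```python
import os
import shutil
import tempfile
import time

import zmq  # pyzmq; assumed to be installed


def bound_publisher_demo(n_subscribers=2, timeout_s=5.0):
    """Subscribers connect to one endpoint; the publisher binds it later."""
    ctx = zmq.Context()
    tmpdir = tempfile.mkdtemp()
    endpoint = "ipc://" + os.path.join(tmpdir, "pub.ipc")  # hypothetical name

    # Subscribers connect first, while no publisher exists yet.
    subs = []
    for _ in range(n_subscribers):
        sub = ctx.socket(zmq.SUB)
        sub.setsockopt(zmq.SUBSCRIBE, b"")
        sub.connect(endpoint)  # retried in the background until it succeeds
        subs.append(sub)

    # The short-lived publisher appears and binds the single endpoint.
    pub = ctx.socket(zmq.PUB)
    pub.setsockopt(zmq.LINGER, 0)
    pub.bind(endpoint)

    # Resend until every subscriber has connected and received the message.
    received = [None] * n_subscribers
    deadline = time.time() + timeout_s
    while None in received and time.time() < deadline:
        pub.send(b"hello")
        for i, sub in enumerate(subs):
            if received[i] is None and sub.poll(50):
                received[i] = sub.recv()

    pub.close()
    for sub in subs:
        sub.close(linger=0)
    ctx.term()
    shutil.rmtree(tmpdir, ignore_errors=True)
    return received
```

Note that messages published before a subscriber has (re)connected are simply lost, which is why the sketch resends.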

Pieter Hintjens
2

You can't bind multiple sockets to the same address on one machine, whether it's ipc or tcp, PUB/SUB or REQ/REP. It's just like binding a network socket.

The way to send messages to all subscribers from many publishers is to implement a simple broker, which binds a SUB address and a PUB address. Publishers connect to the broker's SUB socket to send messages, subscribers connect to the PUB socket of the same broker, and the broker simply forwards every message received on the SUB socket to the PUB socket. It adds some performance overhead, but is quite easy to program.

In ZeroMQ 2.0 there's an executable, zmq_forwarder, that can be used for just this purpose; in 2.1, refer to the zmq_device(3) function.
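
For illustration, here is a hand-rolled version of such a broker in Python with pyzmq (an assumption). It does not use zmq_forwarder or zmq_device; it is a plain poll-and-forward loop so that it can be stopped cleanly. The endpoint names, function names and the stop event are hypothetical.

```python
import os
import shutil
import tempfile
import threading
import time

import zmq  # pyzmq; assumed to be installed


def run_forwarder(ctx, frontend_ep, backend_ep, stop):
    """Forward everything received on a bound SUB socket to a bound PUB."""
    frontend = ctx.socket(zmq.SUB)
    frontend.setsockopt(zmq.SUBSCRIBE, b"")
    frontend.bind(frontend_ep)  # publishers connect here
    backend = ctx.socket(zmq.PUB)
    backend.bind(backend_ep)  # subscribers connect here
    while not stop.is_set():
        if frontend.poll(50):  # wait up to 50 ms for a message
            backend.send_multipart(frontend.recv_multipart())
    frontend.close(linger=0)
    backend.close(linger=0)


def forward_demo(n_subscribers=2, timeout_s=5.0):
    """Send one message through the broker to every subscriber."""
    ctx = zmq.Context()
    tmpdir = tempfile.mkdtemp()
    frontend_ep = "ipc://" + os.path.join(tmpdir, "frontend.ipc")
    backend_ep = "ipc://" + os.path.join(tmpdir, "backend.ipc")

    stop = threading.Event()
    broker = threading.Thread(
        target=run_forwarder, args=(ctx, frontend_ep, backend_ep, stop)
    )
    broker.start()

    subs = []
    for _ in range(n_subscribers):
        sub = ctx.socket(zmq.SUB)
        sub.setsockopt(zmq.SUBSCRIBE, b"")
        sub.connect(backend_ep)
        subs.append(sub)

    pub = ctx.socket(zmq.PUB)
    pub.connect(frontend_ep)

    # Everything connects asynchronously, so resend until all have the message.
    received = [None] * n_subscribers
    deadline = time.time() + timeout_s
    while None in received and time.time() < deadline:
        pub.send(b"hello")
        for i, sub in enumerate(subs):
            if received[i] is None and sub.poll(50):
                received[i] = sub.recv()

    stop.set()
    broker.join()
    pub.close(linger=0)
    for sub in subs:
        sub.close(linger=0)
    ctx.term()
    shutil.rmtree(tmpdir, ignore_errors=True)
    return received
```

In practice the broker would be its own long-running process; the thread here only keeps the sketch in one file.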

Gary Shi
  • If I understand this correctly, that broker process will be a single point of failure. Is there a way to avoid this? – Alexander Gladysh Apr 25 '11 at 16:22
  • I think so. But since all your programs run on a single computer (with ipc:// socket), hardware and network failures won't be the problem. The broker process itself might be too simple to fail. However, if it involves many nodes, I'm not sure whether this is the best way. – Gary Shi Apr 26 '11 at 23:36
  • There is always a Chaos Monkey, even on a single machine. (Just kidding :-) ) – Alexander Gladysh Apr 27 '11 at 04:45
  • It really is worthwhile to read through the ZMQ guide and get an overview of how to structure all the various communications possibilities. http://zguide.zeromq.org/page:all – Michael Dillon Jul 01 '11 at 02:12