Pub Sub implementation zero mq 3.xx

Question

I have been working with Qpid and am now trying to move to a brokerless messaging system, but I am confused about network traffic in the pub/sub pattern. I read the following document: http://www.250bpm.com/pubsub#toc4 and do not understand how subscription forwarding is actually done.

I thought ZeroMQ had to be agnostic of the underlying network topology, but it seems it is not. How does every node know what to forward and what not to forward? For example, in an Ethernet network where there can be millions of subscribers and publishers, a message tree does not sound feasible to me. What about the hops that do not even know ZeroMQ exists? How would they forward packets to the subscribers connected to them? To them it would be just a normal packet, so would they simply forward multiple copies of the data even when it is the same packet? I am not a networking expert, so maybe I am missing something obvious about the message tree and how it is created. Could you please give some example cases showing how this distribution tree is created and on exactly which nodes the XPUB and XSUB sockets are created?

Is a device (the term used in the link) something like a broker? Throughout the article it seemed that a device is just any general intermediary hop that knows nothing about ZeroMQ sockets (just a random network hop). If it is indeed a broker-like thing, does that mean that for pub/sub all nodes in the messaging tree have to satisfy the definition of a device, and hence it is not a brokerless design?

Also, in the tree diagram from the link (which consists of P, D and C), I initially assumed that C and C are two subscribers, P is the only publisher, and D is just a random hop, but now it seems that D is the ZeroMQ node. Does C subscribe to D and D subscribe to P, or do both Cs just subscribe to P? (To be more generic, does each node subscribe only to its parent?) Sorry for the novice question, but it seems I am missing something obvious here. It would be nice if someone could give more insight.

score 1 · Answer 1 · answered Apr 17 '12 at 15:27

ZeroMQ uses the network to establish connections directly between nodes (e.g. via TCP), but only ever between one sender and 1 to n receivers. These are connected "directly" and exchange messages using the underlying protocol.

Now, when you subscribe to only certain events in a pub-sub scenario, ZeroMQ used to filter out messages on the subscriber side, which caused unnecessary network traffic from the publisher to at least some of the subscribers.

In newer versions of ZeroMQ (3.0 and 3.1) the subscriber process sends its subscription list to the publisher, which manages a list of subscribers and the topics they are interested in. The publisher can thus discard messages that no subscriber has subscribed to, and potentially send each message only to the interested subscribers.
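To make the publisher-side bookkeeping concrete, here is a minimal toy model in plain Python. It is not the ZeroMQ API: the `ToyPublisher` class, the subscriber names and the topic strings are all made up for illustration. The only thing it borrows from ZeroMQ is that subscriptions match by message prefix.

```python
from collections import defaultdict


class ToyPublisher:
    """Toy model of publisher-side filtering (hypothetical, not the ZeroMQ API)."""

    def __init__(self):
        # subscriber name -> set of topic prefixes it asked for
        self.subscriptions = defaultdict(set)
        # (subscriber, message) pairs that were actually put on the wire
        self.sent = []

    def subscribe(self, subscriber, prefix):
        # Models the subscriber sending its subscription upstream.
        self.subscriptions[subscriber].add(prefix)

    def publish(self, message):
        # Send only to subscribers with a matching prefix; a message
        # nobody subscribed to is dropped before it reaches the network.
        for subscriber, prefixes in self.subscriptions.items():
            if any(message.startswith(p) for p in prefixes):
                self.sent.append((subscriber, message))


pub = ToyPublisher()
pub.subscribe("c1", "weather.")
pub.subscribe("c2", "sports.")
pub.publish("weather.london 12C")   # goes to c1 only
pub.publish("finance.nasdaq up")    # nobody subscribed, dropped at the publisher
print(pub.sent)
```

With subscriber-side filtering, both messages would have been sent to both subscribers (four sends); here only one send happens.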

When the publisher is itself a subscriber to events (e.g. a forwarding or routing device), it may forward those subscriptions in turn by subscribing to the publishers it is connected to.
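For the P, D, C tree from the question, the forwarding can be sketched as another toy model in plain Python. Again, this is not the ZeroMQ API: `Node`, `Subscriber` and the topic names are hypothetical, and a real device would do this with XSUB/XPUB sockets. The point is only that each node subscribes to its parent, and that P sends each message once to D regardless of how many Cs sit behind D.

```python
class Subscriber:
    """Leaf of the tree (a 'C' in the diagram); just records what it gets."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def deliver(self, message):
        self.received.append(message)


class Node:
    """A publisher (no upstream) or a forwarding device (has an upstream)."""

    def __init__(self, name, upstream=None):
        self.name = name
        self.upstream = upstream
        self.downstream = []     # (peer, prefix) pairs subscribed to this node
        self.forwarded = set()   # prefixes already forwarded to the parent
        self.sends = 0           # messages this node put on the wire

    def subscribe(self, peer, prefix):
        self.downstream.append((peer, prefix))
        # A device passes each new subscription on to its own parent, once.
        if self.upstream is not None and prefix not in self.forwarded:
            self.forwarded.add(prefix)
            self.upstream.subscribe(self, prefix)

    def deliver(self, message):
        # A message arriving from the parent is re-published downstream.
        self.publish(message)

    def publish(self, message):
        targets = []
        for peer, prefix in self.downstream:
            if message.startswith(prefix) and peer not in targets:
                targets.append(peer)
        for peer in targets:
            self.sends += 1
            peer.deliver(message)


p = Node("P")
d = Node("D", upstream=p)
c1, c2 = Subscriber("C1"), Subscriber("C2")
d.subscribe(c1, "a.")    # D forwards "a." to P
d.subscribe(c2, "a.")    # already forwarded, nothing new goes to P
p.publish("a.hello")
p.publish("b.ignored")   # P has no matching subscription, drops it
print(p.sends, d.sends, c1.received, c2.received)
```

In this sketch P is never aware of C1 or C2; it only knows that D subscribed to `a.`.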

I am not sure, though, whether ZeroMQ still does client-side filtering in newer versions even when it "forwards" its subscriptions.

So in the messaging tree diagram the intermediate devices are in fact publishers that are subscribers too? So for a network like a->b->c, where a is the only publisher and b and c are just subscribers to the same topic, the message will be sent twice by a (once to b and once to c)? It would be smarter to send the message just once and have b pass it through, but since b doesn't know about the existence of c, the message will arrive at hop b twice? — user179156, Apr 17 '12 at 16:02
@user179156 you can't connect a subscriber socket to another subscriber socket, so your a->b->c example doesn't make sense. I guess you meant b is a router. In that case, a only sends the message once, and is not even aware of c. — Oktalist, May 08 '13 at 13:37
@user179156 for the b<-a->c case in which the two connections share some infrastructure, the MQ can't be expected to know about that, it's up to you to design your zeromq socket topology to be a better fit for your situation. You might want a broker. — Oktalist, May 08 '13 at 13:46

score 0 · Answer 2 · answered Apr 17 '12 at 23:19

A more efficient mechanism for pub/sub to multiple subscribers is to use multicast whereby a single message traverses the network and is received by all subscribers (who can then filter what they wish).

ZeroMQ supports a standardised reliable multicast protocol called Pragmatic General Multicast (PGM).

These references should give you an idea of how it all works. Note that multicast generally only works within a single subnet and may need router configuration or TCP bridges to span multiple subnets.
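As a rough back-of-the-envelope sketch of the tradeoff, the snippet below compares how many transmissions the publisher itself makes per second with per-subscriber unicast versus multicast. The function name is made up, the figures (1000 subscribers, 100,000 messages/sec) are only illustrative, and PGM repair traffic (NAKs and retransmissions) is ignored.

```python
def publisher_sends_per_second(msgs_per_sec, interested_subscribers, multicast):
    """Transmissions the publisher makes per second (illustrative model only)."""
    if multicast:
        # One transmission per message, however many receivers there are.
        return msgs_per_sec
    # Unicast (e.g. TCP): one copy per interested subscriber.
    return msgs_per_sec * interested_subscribers


rate = 100_000
unicast = publisher_sends_per_second(rate, 1000, multicast=False)
mcast = publisher_sends_per_second(rate, 1000, multicast=True)
print(unicast, mcast)
```

The unicast figure shrinks if only a few subscribers are interested in each topic, which is the case where publisher-side filtering over TCP looks better; the multicast figure stays flat but every receiver has to filter.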

answered by scaganoff

Not a very elegant way, since every message is received by every receiver and the burden is on the receiver to drop it, which hurts latency :( – user179156 Apr 17 '12 at 23:43
Sure, it's a tradeoff between network resource utilization and receiver resource utilization. You choose depending on your requirements. – scaganoff Apr 18 '12 at 07:45
Your question refers to "millions of subscribers". What's the latency of a publisher managing millions of direct tcp connections and duplicating messages to each? And the poor subscriber at the end gets the message way after the rest. Multicast is a very elegant mechanism for pub/sub to large numbers of subscribers. ZeroMQ allows for pluggable transport: tcp, pgm or whatever. Choose what's best for your requirements. – scaganoff Apr 18 '12 at 23:49
What about 1000 subscribers and about 1000 topics? Multicasting messages to all 1000 subscribers will create too much network traffic (messages are being published at about 100,000/sec). – user179156 Apr 19 '12 at 02:16
user179156, you don't appear to understand how multicast works. If you have 1000 clients and are using multicast, each message sent to all 1000 (or a part of them) is one transmission. That's why multicast is so much more efficient -- 1000, 10,000, or 100,000 would all consume about the same amount of traffic. Retransmissions may add to that, if your network is not reliable, but most local-ethernets are. – Michael Graff Nov 17 '13 at 13:26
