
I'm wondering what the best way is to publish and subscribe to channels using BookSleeve. I currently implement several static methods (see below) that let me publish content to a specific channel, with the newly created channel being stored in `private static Dictionary<string, RedisSubscriberConnection> subscribedChannels;`.

Is this the right approach, given that I want to publish to channels and subscribe to channels within the same application (note: my wrapper is a static class)? Is it enough to create one channel even if I want to both publish and subscribe? Obviously I would not publish to the same channel that I subscribe to within the same application. But I tested the following:

 RedisClient.SubscribeToChannel("Test").Wait();
 RedisClient.Publish("Test", "Test Message");

and it worked.

Here are my questions:

1) Will it be more efficient to set up a dedicated publish channel and a dedicated subscribe channel rather than using one channel for both?

2) What is the difference between "channel" and "PatternSubscription" semantically? My understanding is that I can subscribe to several "topics" through PatternSubscription() on the same channel, correct? But if I want different callbacks invoked for each "topic", I would have to set up a channel for each topic, correct? Is that efficient, or would you advise against it?

Here are the code snippets.

Thanks!!!

    // shared BookSleeve connection, plus the subscriber connections keyed by subscription string
    private static RedisConnection connection;
    private static Dictionary<string, RedisSubscriberConnection> subscribedChannels =
        new Dictionary<string, RedisSubscriberConnection>();

    public static Task<long> Publish(string channel, byte[] message)
    {
        return connection.Publish(channel, message);
    }

    public static Task SubscribeToChannel(string channelName)
    {
        string subscriptionString = ChannelSubscriptionString(channelName);

        RedisSubscriberConnection channel = connection.GetOpenSubscriberChannel();

        subscribedChannels[subscriptionString] = channel;

        return channel.PatternSubscribe(subscriptionString, OnSubscribedChannelMessage);
    }

    public static Task UnsubscribeFromChannel(string channelName)
    {
        string subscriptionString = ChannelSubscriptionString(channelName);

        if (subscribedChannels.ContainsKey(subscriptionString))
        {
            RedisSubscriberConnection channel = subscribedChannels[subscriptionString];

            Task task = channel.PatternUnsubscribe(subscriptionString);

            //remove channel subscription
            channel.Close(true);
            subscribedChannels.Remove(subscriptionString);

            return task;
        }
        else
        {
            return null;
        }
    }

    private static string ChannelSubscriptionString(string channelName)
    {
        return channelName + "*";
    }
Matt

1 Answer


1: there is only one channel in your example (Test); a channel is just the name used for a particular pub/sub exchange. It is, however, necessary to use 2 connections due to specifics of how the redis API works. A connection that has any subscriptions cannot do anything else except:

  • listen to messages
  • manage its own subscriptions (subscribe, psubscribe, unsubscribe, punsubscribe)
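
To make that concrete, here is a minimal sketch of the two-connection split (a fragment, not a full program; channel names are placeholders, the byte[] publish overload matches the question's wrapper, and the (channel, payload) handler shape is assumed to mirror the PatternSubscribe callback used in the question):

// regular connection: used for publish (and any other commands)
var pubConn = new RedisConnection("localhost");
pubConn.Open().Wait(); // Open() is asynchronous; wait for it here for simplicity

// dedicated subscriber connection: can only listen and manage its own subscriptions
var subConn = pubConn.GetOpenSubscriberChannel();

// wait for the subscription to be in place before publishing to it
subConn.Subscribe("Test", (channel, payload) =>
    Console.WriteLine("{0}: {1} bytes", channel, payload.Length)).Wait();

// publishing happens on the regular connection
pubConn.Publish("Test", Encoding.UTF8.GetBytes("Test Message"));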

However, I don't understand this:

private static Dictionary<string, RedisSubscriberConnection>

You shouldn't need more than one subscriber connection unless you are catering for something specific to you. A single subscriber connection can handle an arbitrary number of subscriptions. A quick check of client list on one of my servers shows one connection with (at the time of writing) 23,002 subscriptions. That could probably be reduced, but: it works.
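
For illustration, the single subscriber connection from the sketch above can simply accumulate subscriptions (the channel names here are hypothetical):

// one RedisSubscriberConnection, many independent subscriptions
for (int i = 0; i < 10000; i++)
{
    subConn.Subscribe("/topic/" + i, (channel, payload) =>
        Console.WriteLine("received {0} bytes on {1}", payload.Length, channel));
}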

2: pattern subscriptions support wildcards; so rather than subscribing to /topic/1, /topic/2, etc., you could subscribe to /topic/*. The name of the actual channel used by publish is provided to the receiver as part of the callback signature.
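
As a sketch (reusing subConn and pubConn from above, with the same assumed handler shape), a single pattern subscription can replace the per-topic loop:

// a single pattern subscription covers /topic/1, /topic/2, ...
subConn.PatternSubscribe("/topic/*", (channel, payload) =>
{
    // 'channel' is the concrete channel the message was published to,
    // e.g. "/topic/42", so one callback can dispatch per topic if needed
    Console.WriteLine("pattern match on {0}", channel);
});

// published on the regular connection; received via the pattern subscription
pubConn.Publish("/topic/42", Encoding.UTF8.GetBytes("hello"));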

Either can work. It should be noted that the performance of publish is impacted by the total number of unique subscriptions - but frankly it is still stupidly fast (as in: 0ms) even if you have tens of thousands of channels subscribed using subscribe rather than psubscribe.

But note, from the publish documentation:

Time complexity: O(N+M) where N is the number of clients subscribed to the receiving channel and M is the total number of subscribed patterns (by any client).

I recommend reading the redis pub/sub documentation.


Edit for follow-on questions:

a) I assume I would have to "publish" synchronously (using Result or Wait()) if I want to guarantee the order of sending items from the same publisher is preserved when receiving items, correct?

that won't make any difference at all; since you mention Result / Wait(), I assume you're talking about BookSleeve - in which case the multiplexer already preserves command order. Redis itself is single threaded, and will always process commands on a single connection in order. However: the callbacks on the subscriber may be executed asynchronously and may be handed (separately) to a worker thread. I am currently investigating whether I can force this to be in-order from RedisSubscriberConnection.

Update: from 1.3.22 onwards you can set the CompletionMode to PreserveOrder - then all callbacks will be completed sequentially rather than concurrently.
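
A sketch of what that looks like (the CompletionMode property and PreserveOrder value are named in the update above; the enum type name ResultCompletionMode is an assumption here, so check the 1.3.22 API before relying on it):

// 1.3.22+: complete subscriber callbacks sequentially, in publish order
subConn.CompletionMode = ResultCompletionMode.PreserveOrder;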

b) after making adjustments according to your suggestions I get a great performance when publishing few items regardless of the size of the payload. However, when sending 100,000 or more items by the same publisher performance drops rapidly (down to 7-8 seconds just to send from my machine).

Firstly, that time sounds high - testing locally I get (for 100,000 publications, including waiting for the response for all of them) 1766ms (local) or 1219ms (remote) (that might sound counter-intuitive, but my "local" isn't running the same version of redis; my "remote" is 2.6.12 on Centos; my "local" is 2.6.8-pre2 on Windows).

I can't make your actual server faster or speed up the network, but: in case this is packet fragmentation, I have added (just for you) a SuspendFlush() / ResumeFlush() pair. This disables eager-flushing (i.e. when the send-queue is empty; other types of flushing still happen); you might find this helps:

conn.SuspendFlush(); // stop eager-flushing while lots of work is queued up
try {
    // start lots of operations...
} finally {
    conn.ResumeFlush(); // re-enable flushing and push out anything still buffered
}

Note that you shouldn't Wait until you have resumed, because until you call ResumeFlush() there could be some operations still in the send-buffer. With that all in place, I get (for 100,000 operations):

local: 1766ms (eager-flush) vs 1554ms (suspend-flush)
remote: 1219ms (eager-flush) vs 796ms (suspend-flush)

As you can see, it helps more with remote servers, as it will be putting fewer packets through the network.

I cannot use transactions because later on the to-be-published items are not all available at once. Is there a way to optimize with that knowledge in mind?

I think that is addressed by the above - but note that recently CreateBatch was added too. A batch operates a lot like a transaction - just: without the transaction. Again, it is another mechanism to reduce packet fragmentation. In your particular case, I suspect the suspend/resume (on flush) is your best bet.

Do you recommend having one general RedisConnection and one RedisSubscriberConnection or any other configuration to have such wrapper perform desired functions?

As long as you're not performing blocking operations (blpop, brpop, brpoplpush etc), or putting oversized BLOBs down the wire (potentially delaying other operations while it clears), then a single connection of each type usually works pretty well. But YMMV depending on your exact usage requirements.
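
So, as a sketch under those assumptions (RedisClient and the member names simply mirror the question's wrapper; the handler signature is assumed as above), one connection of each type is typically all the wrapper needs:

public static class RedisClient
{
    // one multiplexed connection for CRUD and publishing
    private static readonly RedisConnection connection = CreateConnection();

    // one dedicated connection that only listens and manages subscriptions
    private static readonly RedisSubscriberConnection subscriber =
        connection.GetOpenSubscriberChannel();

    private static RedisConnection CreateConnection()
    {
        var conn = new RedisConnection("localhost");
        conn.Open().Wait(); // open once, up front, before first use
        return conn;
    }

    public static Task<long> Publish(string channel, byte[] message)
    {
        return connection.Publish(channel, message);
    }

    public static Task SubscribeToChannel(string channelName, Action<string, byte[]> handler)
    {
        return subscriber.Subscribe(channelName, handler);
    }
}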

Marc Gravell
  • thank you for the explanation. Couple follow-up questions if I may: a) I assume I would have to "publish" synchronously (using `Result` or `Wait()`) if I want to guarantee the order of sending items from the same publisher is preserved when receiving items, correct? b) after making adjustments according to your suggestions I get a great performance when publishing few items regardless of the size of the payload. However, when sending 100,000 or more items by the same publisher performance drops rapidly (down to 7-8 seconds just to send from my machine). – Matt Apr 09 '13 at 04:50
  • ...I cannot use transactions because later on the to-be-published items are not all available at once. Is there a way to optimize with that knowledge in mind? Sending 100k items (including the time it takes to move across the local network and the time to receive on a separate process) each of size 16 bytes is a lot faster when using ZeroMQ than Redis, so much faster that I believe I may still be doing something wrong with Redis when publishing items to just one channel. – Matt Apr 09 '13 at 04:50
  • And a last question. I wrote a wrapper which is supposed to a) perform general CRUD on stored POCOs, b) publish to channels, c) subscribe to channels. Do you recommend having one general RedisConnection and one RedisSubscriberConnection or any other configuration to have such wrapper perform desired functions? – Matt Apr 09 '13 at 05:03
  • Regarding the order of published items and order of callbacks on the subscriber side, there is definitely a mismatch and given your information it must indeed have to do with the asynchronous callbacks. I tested and the order is not preserved unless I publish using Wait(). – Matt Apr 09 '13 at 15:46
  • @Freddy indeed; this was a change in 1.3 and the unexpected impact has been noted - like I say: I'm working on that. My personal code never made assumptions about order, so I didn't notice. I have new tests for it, and will be playing with changing this - probably as an optional thing (the "allow full async" also was intentional and by design). If you use a 1.2.* version it'll be fine. – Marc Gravell Apr 09 '13 at 16:49
  • thanks, will try it out. First, I like to play with your suggestions regarding "Flush" as your performance looks a lot better than what I got. – Matt Apr 09 '13 at 21:34
  • couple observations after playing with all options: (a) Suspend/Resume Flush is probably not gonna work for me because I need items to be delivered as soon as possible, and as the payload is quite light (16 bytes each) it gets stuck in the buffer unless the buffer gets full or is flushed. (b) I would be very happy to see order preservation even in the async case, `Publish(...).Wait()` definitely causes huge delays vs the async version. Async as discussed does not preserve the order of items sent (yet)...continued... – Matt Apr 09 '13 at 22:38
  • (c) I find it strange that when Publishing without Wait (using async) not all sent items are received. Any idea why that is? I made sure not to suspend the flush in that particular test. => As a result I currently can only use `Publish.Wait()` which is a definite performance drag and would not want me to abandon ZeroMQ. But if order preservation can be achieved in async mode AND all items are received as sent (as mentioned for some reason items are dropped in async mode, no idea why) then this would be terrific. Any time line in mind re order preservation? Thanks a lot in advance for comments – Matt Apr 09 '13 at 22:41
  • @Freddy are you sure about that "dropped message" claim? I'm testing here with 500,000 messages: get them all. Do you have a scenario that demo's that? Is there any chance your test does `sub.Subscribe(...);` and then immediately starts publishing, without waiting on the subscribe to complete? (in 1.3, the subscribe returns a task that can be waited/awaited/etc) – Marc Gravell Apr 10 '13 at 06:07
  • I use two console apps, one only publishes, the other only subscribes and receives messages. As long as I send async without `Wait()`, the number of messages that arrive never matches the number sent. When I use `Wait()` the message count is identical, but it's incredibly slow. – Matt Apr 10 '13 at 06:28
  • @Freddy are you sure that isn't just a threading issue? i.e. are you incrementing a counter without a `lock` / `Interlocked`? With regards your other question (when): I have it implemented now - I'm just regression testing – Marc Gravell Apr 10 '13 at 06:41
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/27918/discussion-between-freddy-and-marc-gravell) – Matt Apr 10 '13 at 06:48
  • Marc solved the issue of order preservation and in fact added an option to switch on order preservation when raising the subscriber channel handler. Awesome work. Thank you very much!!!!! – Matt Apr 10 '13 at 07:28