Order of processing different Redis Streams messages

Question

Given that I have a service P (Producer) and a service C (Consumer), my P service needs to:

Create an object X
Create an object Y (dependent on X)
Create an object Z (dependent on Y)
Notify C about X, Y, and Z (via Redis Streams)
C needs to use data from Z, Y, and X to do some local data persistence
Updates to Y are fairly common, but updates to X are rare

From C's perspective, is there a way to guarantee that it has all the info it needs for successful persistence?

I know that services like Kafka and Redis Streams are not generally built for this stuff, but how does one overcome this?

Idea 1:

Send X, Y, and Z in that particular order to the same consumer group. But if we scale the number of workers to anything above 1, we run into the problem of messages being processed out of order.

Idea 2:

Instead of sending X and Y separately to C, I can send a compound object Z, which has Y and X embedded. But that seems like overkill, doesn't it?

Is there any obvious way to handle object dependencies?

score 2 · Answer 1 · answered Jul 13 '22 at 01:15

This is a good question covered in this note about Redis Streams.

We could say that schematically the following is true:

If you use 1 stream -> 1 consumer, you are processing messages in order.

If you use N streams with N consumers, so that only a given consumer hits a subset of the N streams, you can scale the above model of 1 stream -> 1 consumer.

If you use 1 stream -> N consumers, you are load balancing to N consumers, however in that case, messages about the same logical item may be consumed out of order, because a given consumer may process message 3 faster than another consumer is processing message 4.

So basically Kafka partitions are more similar to using N different Redis keys, while Redis consumer groups are a server-side load balancing system of messages from a given stream to N different consumers.
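A minimal sketch of the "N different Redis keys" idea, assuming every message carries a routing id shared by the X, Y, and Z that belong together. The `events:<n>` key naming, `NUM_STREAMS`, and `stream_for` are hypothetical names for illustration, not part of Redis:

```python
import zlib

NUM_STREAMS = 4  # hypothetical: number of stream keys, one consumer per key


def stream_for(routing_id: str, num_streams: int = NUM_STREAMS) -> str:
    """Map a logical item to one of N stream keys, deterministically."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str
    index = zlib.crc32(routing_id.encode("utf-8")) % num_streams
    return f"events:{index}"


# X, Y, and Z that belong together share one routing id,
# so all three are appended to the same stream and keep their order.
routing_id = "x-42"
targets = {stream_for(routing_id) for _ in ("X", "Y", "Z")}
```

Each consumer would then read only its own stream key, which keeps the 1 stream -> 1 consumer ordering while still spreading load across N workers.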

So if you want to process them in order and use more than one consumer you will need to deal with that yourself.
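One way to deal with it on the consumer side is to park a message until the object it depends on has been persisted. The sketch below is only an illustration under assumptions: `Message`, `DependencyBuffer`, and the `depends_on` field are hypothetical, the state lives in one process (several workers would need shared state), and appending to `order` stands in for the real local write.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    obj_id: str
    depends_on: Optional[str] = None  # id of the object this one needs first


class DependencyBuffer:
    def __init__(self):
        self.persisted = set()
        self.waiting = defaultdict(list)  # missing dependency id -> parked messages
        self.order = []  # stand-in for the real persistence layer

    def handle(self, msg: Message) -> None:
        # Park the message if its dependency has not been persisted yet
        if msg.depends_on is not None and msg.depends_on not in self.persisted:
            self.waiting[msg.depends_on].append(msg)
            return
        self._persist(msg)

    def _persist(self, msg: Message) -> None:
        self.persisted.add(msg.obj_id)
        self.order.append(msg.obj_id)
        # Release everything that was waiting for this object
        for parked in self.waiting.pop(msg.obj_id, []):
            self._persist(parked)


buf = DependencyBuffer()
# Messages arrive in the worst possible order: Z, then Y, then X
for m in (Message("Z", "Y"), Message("Y", "X"), Message("X")):
    buf.handle(m)
```

After the loop, `buf.order` holds X, Y, Z in dependency order even though the messages arrived reversed.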

score 2 · Accepted Answer · answered Jul 13 '22 at 02:00

I believe Idea 2 is the better solution, because keeping the whole message in one data structure means C always receives everything it needs together.
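As a rough sketch of Idea 2, the compound object can be serialized into the field map of a single stream entry. The field names (`type`, `payload`) and both helper functions are hypothetical; with redis-py the resulting dict could be passed to `xadd`, and the parser assumes string fields (for example a client created with `decode_responses=True`).

```python
import json


def build_compound_entry(x: dict, y: dict, z: dict) -> dict:
    """Embed X inside Y and Y inside Z, then flatten to one entry's fields."""
    compound = {**z, "y": {**y, "x": x}}
    return {"type": "Z", "payload": json.dumps(compound)}


def parse_compound_entry(fields: dict) -> tuple:
    """Split a compound entry back into (x, y, z) on the consumer side."""
    compound = json.loads(fields["payload"])
    y = compound.pop("y")
    x = y.pop("x")
    return x, y, compound


entry = build_compound_entry(
    {"id": 1},
    {"id": 2, "x_id": 1},
    {"id": 3, "y_id": 2},
)
x, y, z = parse_compound_entry(entry)
```

Because X, Y, and Z travel in one entry, any worker in the consumer group can persist them without waiting for other messages. The cost is that every update to Y or Z resends the embedded X.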

Alternatively, you could try using multiple keys.

For example:

On Service P

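now_timestamp = 1515151551   # current Unix timestamp, let's say it is 1515151551

# write the values first and announce the timestamp last,
# so C never pops a timestamp whose keys do not exist yet
redis SET X_1515151551 INFO_OF_X
redis SET Y_1515151551 INFO_OF_Y
redis SET Z_1515151551 INFO_OF_Z
redis SADD not_processed_timestamp 1515151551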

On service C, create a new thread

new_task_timestamp = redis SPOP not_processed_timestamp   # let's say it is 1515151551
# Redis has no blocking GET for string keys,
# so retry GET until each key exists
redis GET X_1515151551
redis GET Y_1515151551
redis GET Z_1515151551

# process the rest
