
Given that I have a service P (Producer) and a service C (Consumer), my P service needs to:

  • Create an object X
  • Create an object Y (dependent on X)
  • Create an object Z (dependent on Y)
  • Notify C about X, Y, and Z (via Redis Streams)
  • C needs to use data from Z, Y, and X to do some local data persistence
  • Updates to Y are fairly common, but updates to X are rare

From C's perspective, is there a way to guarantee that it has all the info it needs for successful persistence?

I know that services like Kafka and Redis Streams are not generally built for this stuff, but how does one overcome this?

Idea 1:

  • Send X, Y, and Z in that particular order to the same consumer group. But if we scale the number of workers to anything above 1, we run into the problem that different workers may process them out of order

Idea 2:

  • Instead of sending X and Y separately to C, I can send a compound object Z, which has Y and X embedded. But that really seems like overkill, doesn't it?

Is there any obvious way to handle object dependencies?

Jovan Perovic

2 Answers


This is a good question covered in this note about Redis Streams.

We could say that schematically the following is true:

If you use 1 stream -> 1 consumer, you are processing messages in order.

If you use N streams with N consumers, so that only a given consumer hits a subset of the N streams, you can scale the above model of 1 stream -> 1 consumer.
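
One way to get the N-streams layout is to route every message about the same logical item to the same stream key. The following is only a minimal sketch: the `events` key prefix and the idea that each message carries the id of its root object X are assumptions made for illustration, not something the note prescribes.

```python
import zlib


def stream_key_for(root_id: str, num_streams: int, prefix: str = "events") -> str:
    """Map a logical item to one of num_streams stream keys.

    zlib.crc32 is used instead of the built-in hash() because it is stable
    across processes, so every producer picks the same stream for the same id.
    """
    if num_streams < 1:
        raise ValueError("num_streams must be at least 1")
    shard = zlib.crc32(root_id.encode("utf-8")) % num_streams
    return f"{prefix}:{shard}"
```

A producer would then add X, Y, and Z to the stream named by `stream_key_for(x_id, N)`, and each stream would be read by exactly one consumer, which keeps the order per item.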

If you use 1 stream -> N consumers, you are load balancing to N consumers, however in that case, messages about the same logical item may be consumed out of order, because a given consumer may process message 3 faster than another consumer is processing message 4.

So basically Kafka partitions are more similar to using N different Redis keys, while Redis consumer groups are a server-side load balancing system of messages from a given stream to N different consumers.

So if you want to process them in order and use more than one consumer you will need to deal with that yourself.
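
As an illustration of what "dealing with it yourself" might look like on the consumer side, here is a minimal sketch of a buffer that parks a message until the object it depends on has been persisted. The message shape (`id`, `parent`, `data`) and the `persist` callback are hypothetical and only serve the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Set


class DependencyBuffer:
    """Hold messages until the object they depend on has been persisted."""

    def __init__(self, persist: Callable[[str, dict], None]) -> None:
        self._persist = persist
        self._done: Set[str] = set()  # ids already persisted
        self._waiting: Dict[str, List[dict]] = defaultdict(list)  # parent id -> parked messages

    def accept(self, msg: dict) -> None:
        """msg is assumed to look like {"id": ..., "parent": ... or None, "data": ...}."""
        parent: Optional[str] = msg.get("parent")
        if parent is not None and parent not in self._done:
            self._waiting[parent].append(msg)  # dependency missing: park it
            return
        self._flush(msg)

    def _flush(self, msg: dict) -> None:
        # Persist msg, then release anything that was waiting on it.
        ready = [msg]
        while ready:
            current = ready.pop()
            self._persist(current["id"], current["data"])
            self._done.add(current["id"])
            ready.extend(self._waiting.pop(current["id"], []))
```

If Z, Y, and X arrive in that order, nothing is persisted until X shows up, and then X, Y, and Z are persisted in dependency order. Note that this sketch keeps its state in memory, so with several workers in one consumer group the "already persisted" check would have to be made against shared storage instead.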

Mahdi Yusuf

I believe Idea 2 is the better solution, because keeping the whole message in one data structure means C never sees Z without the Y and X it depends on.
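
A rough sketch of what Idea 2 could look like: a stream entry is a flat map of fields to string values, so the nested object is serialized to JSON and sent as one field. The field names `type` and `payload` and the nesting under `y` and `x` are assumptions for illustration, and the sketch assumes Z has no field of its own called `y`.

```python
import json
from typing import Dict


def build_compound_fields(x: dict, y: dict, z: dict) -> Dict[str, str]:
    """Flatten Z, with Y and X embedded, into one field map for a single stream entry."""
    compound = {**z, "y": {**y, "x": x}}
    return {"type": "Z", "payload": json.dumps(compound, sort_keys=True)}


def parse_compound_fields(fields: Dict[str, str]) -> dict:
    """Inverse of build_compound_fields, used on the consumer side."""
    return json.loads(fields["payload"])
```

The producer would add the returned mapping as a single entry to the stream, and C would call `parse_compound_fields` on what it reads, so it always has X, Y, and Z together.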

Alternatively, you could try using multiple keys.

For example:

On Service P

def now_timestamp = datetime.currentstamp # let's say it is 1515151551

redis sadd not_processed_timestamp 1515151551
redis set X_1515151551 INFO_OF_X
redis set Y_1515151551 INFO_OF_Y
redis set Z_1515151551 INFO_OF_Z

On service C, create a new thread

def new_task_timestamp = redis spop not_processed_timestamp # let's say it is 1515151551
# "blocking-get" is pseudocode: Redis has no blocking GET, so poll GET until the key exists
redis blocking-get X_1515151551
redis blocking-get Y_1515151551
redis blocking-get Z_1515151551

# process the rest
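
The pseudocode above could be sketched in Python roughly as follows. This is not a definitive implementation: `client` is assumed to be any object exposing `sadd`, `set`, `get`, and `spop` the way a redis-py client does, and the "blocking-get" is approximated by polling with a timeout. The function names and the timeout values are made up for the example.

```python
import time
from typing import Optional

PENDING_SET = "not_processed_timestamp"


def publish(client, ts: int, x: str, y: str, z: str) -> None:
    """Producer side: announce the timestamp, then write the three keys."""
    client.sadd(PENDING_SET, ts)
    client.set(f"X_{ts}", x)
    client.set(f"Y_{ts}", y)
    client.set(f"Z_{ts}", z)


def wait_for(client, key: str, timeout: float = 5.0, interval: float = 0.05):
    """Poll GET until the key exists; stands in for the 'blocking-get' above."""
    deadline = time.monotonic() + timeout
    while True:
        value = client.get(key)
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{key} did not appear within {timeout}s")
        time.sleep(interval)


def consume_one(client, timeout: float = 5.0) -> Optional[dict]:
    """Consumer side: claim one timestamp and collect X, Y, and Z for it."""
    ts = client.spop(PENDING_SET)
    if ts is None:
        return None  # nothing to process right now
    if isinstance(ts, bytes):
        ts = ts.decode()  # the client may return bytes rather than str
    return {name: wait_for(client, f"{name}_{ts}", timeout) for name in ("X", "Y", "Z")}
```

Because `spop` removes the timestamp from the set, only one worker claims a given batch, and that worker does not continue until all three keys are present.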
Hi computer