
Given the following scenario:

I have a system that creates, updates and deletes records. For each of these actions I need to do something (let's say write the events to a log, as a silly example); however, I need to process these events for each record in order, meaning I can't log the delete before I have done the create, or any of the previous updates. I also can't log an update before I have logged the create.

I am investigating queues in order to preserve sequence. However, I don't really want RecordID_2 to be held up behind RecordID_14. The records do not need to be processed in sequence as much as the actions on each record do. Hence I don't think I can/should use one queue.

As I don't have hundreds of RecordID_XX active at the same time, I was thinking of having a queue for each RecordID_XX. If several updates came in for that one RecordID, each event for that record would be added to that same queue and processed in order (i.e. Create first, Update_1 after Create is complete, Update_2 after Update_1 is complete, etc.), while events for a different record would be added to their own queue. If a queue is empty for a period of time it simply gets deleted. I realize that this may result in a queue getting one message and then being deleted because no further updates arrived before the idle timeout expired. (This does not seem at all efficient.)
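A minimal in-memory sketch of this per-record-queue idea (plain `java.util.concurrent`, no broker involved; the class and method names are purely illustrative, not from any JMS API):

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// One FIFO queue per record: events for the same record stay in order,
// while different records can be drained independently.
class PerRecordQueues {
    private final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    // Append an event to the queue for its record, creating the queue on demand.
    public void enqueue(String recordId, String event) {
        queues.computeIfAbsent(recordId, id -> new ConcurrentLinkedQueue<>()).add(event);
    }

    // Take the next event for one record; drop the queue once it is empty
    // (standing in for the idle-timeout deletion described above).
    public String dequeue(String recordId) {
        Queue<String> q = queues.get(recordId);
        if (q == null) return null;
        String event = q.poll();
        if (q.isEmpty()) queues.remove(recordId, q);
        return event;
    }
}
```

This only shows the ordering contract; a real broker-based version would still have to solve queue creation/deletion and consumer assignment.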

Based on Andres T Finnell's excellent answer to this question.

I was thinking of doing the following

Producer (Web Service) -> Queue_Original <- Dispatcher -> RecordID_14
                                                       -> RecordID_2
                                                       -> RecordID_8
                                                       -> RecordID_15

Some of the "logging" may take a long time, so I want to be able to have a few consumers listening to these queues.

Let's say I have Consumer_1 and Consumer_2 (I may want to add a Consumer_3 later to assist with growing load).

What I would like is for Consumer_1 to do a getDestinations(), where the broker will return [RecordID_14, RecordID_2, RecordID_8, RecordID_15].

Questions:

  • Is it possible for Consumer_1 to iterate through the list of queues returned by the broker, looking for the first available queue that does not have a Consumer_X connected to it, and begin processing the first message on that queue?
  • And then for each subsequent consumer to do the same until it finds the next queue without a consumer connected to it?
  • Would Advisory-Messages be the thing to use here?

  • Am I going down the wrong path completely? Is there a better approach to handling this scenario?

AcidHawk
  • Any JMS is the wrong way to go, trust me. Have a look at Apache Kafka, or similar. Queues are much more flexible there. – yegodm Mar 22 '18 at 21:16
  • Yep, AMQ Advisory Messages with Message Groups can resolve your problems: http://activemq.apache.org/message-groups.html . Use Spring's DefaultMessageListenerContainer to create concurrent consumers: https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/jms/listener/DefaultMessageListenerContainer.html#setMaxConcurrentConsumers-int- – Hassen Bennour Mar 23 '18 at 07:49
  • @yegodm - So looking at Kafka ... still struggling to fit it into my requirement. Do I have one topic for all messages, and have a Dispatcher (my term) add events for each RecordID into separate partitions (similar to how I described queues above)? However, you can't delete partitions, and I don't want orphaned partitions left over after I have finished processing all events for RecordID_14. I also don't want Consumer_2 to start processing an Update while Consumer_1 is still busy with that record's Create event, so one partition with multiple consumers is also out. – AcidHawk Mar 23 '18 at 13:04
  • Firstly, I am pretty sure that dynamically creating/removing queues is not a good idea at all, as these are most likely quite expensive operations. Secondly, as you stated, there will not be hundreds of records active at a time. You can permanently allocate enough partitions to provide the best degree of parallelism without any need to remove orphaned ones. You just need to route events into partitions in a stable manner, so that RecordID_N always goes to partition #X. – yegodm Mar 23 '18 at 14:30
  • @yegodm - Thanks for your comments, I appreciate your assistance. The only idea I can come up with is to have 10 partitions, with each RecordID_XX routed to one of them by its last digit (0-9). E.g. all events for RecordID_00, 10, 20 go to PartitionA, all events for RecordID_01, 11, 21 go to PartitionB, all events for RecordID_02, 12, 22 go to PartitionC, etc., with one consumer per partition. Some parallelism, but synchronous processing preserved? – AcidHawk Mar 23 '18 at 14:55
  • If you associate a single consumer with a partition, then yes, publishing order is kept. – yegodm Mar 23 '18 at 15:22
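The stable routing scheme discussed in the comments above (last digit of the record ID picks one of ten fixed partitions, one consumer per partition) can be sketched as follows; the class and method names are made up for illustration:

```java
// Route each RecordID_XX to a fixed partition by the last digit of its
// numeric suffix. The same record always maps to the same partition, so
// a single consumer per partition preserves per-record ordering while
// different partitions run in parallel.
class RecordRouter {
    public static int partitionFor(String recordId) {
        // e.g. "RecordID_14" -> suffix 14 -> partition 4
        int suffix = Integer.parseInt(recordId.substring(recordId.lastIndexOf('_') + 1));
        return suffix % 10;
    }
}
```

A Kafka producer could use this value as the partition number (or simply use the record ID as the message key, since Kafka's default partitioner already hashes the key to a stable partition).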

0 Answers