0

I plan to use all 32 partitions in Azure Event Hubs. Requirement: "ordered" processing per partition is critical. Question: if I increase the TUs (Throughput Units) to the maximum available of 20 across all 32 partitions, I get 40 MB/s of egress. Let's say I've calculated that I need 500 client threads processing in parallel (EventProcessorClient) to achieve my throughput needs. How do I achieve this level of parallelism with EventProcessorClient while honoring my ordering requirement? By the way, in Kafka I can create 500 partitions in a topic, and Kafka allows only one thread per partition, guaranteeing event order.

teeboy
  • 408
  • 3
  • 13

1 Answer

0

In short, you really can't do what you're looking to do in the way that you're describing.

The EventProcessorClient is bound to a given Event Hub and consumer group combination and will collaborate with other processors using the same Event Hub/consumer group to evenly distribute the load. Adding more processors than the number of partitions would result in them being idle. You could work around this by using additional consumer groups, but the EventProcessorClient instances will only coordinate with others in the same consumer group; the processors for each consumer group would act independently and you'd end up processing the same events multiple times.
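To make the idle-processor point concrete, here is a toy model (plain Python, not the Azure SDK) of the ownership rule the processor enforces within one consumer group: every partition has exactly one owner, so any processors beyond the partition count have nothing to own.

```python
# Toy simulation of single-owner partition assignment within one consumer group.
# This is NOT how the SDK balances load internally; it only illustrates that
# with 32 partitions, at most 32 of 500 processors can be active.

def assign_partitions(partition_count: int, processor_count: int) -> dict[int, list[int]]:
    """Spread partitions round-robin across processors, one owner per partition."""
    ownership = {proc: [] for proc in range(processor_count)}
    for partition in range(partition_count):
        ownership[partition % processor_count].append(partition)
    return ownership

ownership = assign_partitions(partition_count=32, processor_count=500)
active = sum(1 for parts in ownership.values() if parts)
idle = sum(1 for parts in ownership.values() if not parts)
print(active, idle)  # 32 468 -- only 32 processors ever receive events
```

However the real balancing shakes out, the invariant is the same: one active reader per partition per consumer group, so partition count, not thread count, caps your ordered parallelism.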

There are also quotas on the service side that you may not be taking into account. Assuming that you're using the Standard tier, the maximum number of concurrent reads that you could have for one Event Hub, across all partitions, is 100. For a given Event Hub, you can create a maximum of 20 consumer groups, and each consumer group may have a maximum of 5 active readers at a time. The Event Hubs Quotas page discusses these limits. That said, a Dedicated instance allows higher limits, but you would still have a gap with the strict ordering that you're looking to achieve.
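The reader ceiling falls directly out of those two quotas; a quick sketch of the arithmetic (using the Standard-tier numbers quoted above, which you should verify against the Event Hubs Quotas page):

```python
# Standard-tier quotas as described in the answer above.
MAX_CONSUMER_GROUPS = 20    # consumer groups per Event Hub
MAX_READERS_PER_GROUP = 5   # concurrent readers per consumer group

max_concurrent_readers = MAX_CONSUMER_GROUPS * MAX_READERS_PER_GROUP
print(max_concurrent_readers)  # 100 -- well short of 500, even before ordering is considered
```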

Without knowing more about your specific application scenario (how long it takes to process an event, the relative size of the event body, and what your throughput target is), it's difficult to offer alternative suggestions that may better fit your needs.

Jesse Squire
  • 6,107
  • 1
  • 27
  • 30
  • Thanks! 5 questions. 1) If I have 20 consumer groups with 5 concurrent readers each, I can have a max of 100 parallel readers in the Standard tier? 2) Would the EventProcessorClient coordinate the offsets automatically within each consumer group, guaranteeing ordering within each offset only? 3) I am guessing that if I have a single reader per partition, ordering will be guaranteed? 4) The only way to get past 100 concurrent readers is to go to the Dedicated tier? 5) If a 1-to-1 mapping is recommended between each partition and consumer group, there's no point in having more than 20 partitions in the Standard tier? – teeboy Dec 08 '20 at 00:45
  • I'm going to break this up into several comments to work around the formatting limitations. 1) Yes – Jesse Squire Dec 08 '20 at 14:12
  • 2) The processor will guarantee ordering of events in a single partition for one Event Hub + consumer group pairing. 2a) Event Hubs has an "at-least-once" delivery guarantee; your processing needs to account for potential duplicates, which would break your strict ordering. This is most common when processors are scaled. – Jesse Squire Dec 08 '20 at 14:13
  • 4) You should talk to support or your account rep. It is possible to scale beyond 32 partitions, in some cases, with a support request. – Jesse Squire Dec 08 '20 at 14:13
  • 5) I believe so, but I'm only about 80% sure here. I may be mistaken and the partition may be taken into account for the quota. If so, you'd be able to have 5 readers per partition, per consumer group for the Event Hub instance. I'd double-check to be sure that I'm not misleading you here. – Jesse Squire Dec 08 '20 at 14:16
  • @serkantkaraca: If you're around, would you keep me honest on #5 above? – Jesse Squire Dec 08 '20 at 14:17
  • Regarding question #2, the ordering guarantee applies only if the consumer group has a single receiver, right? Based on the above answers, I can conclude that if my message size is 5 KB, I can process 8,000 messages per second with 20 TUs (40 MB/s egress). If each message takes 1 second to process, then 20 receivers across 20 consumer groups would take 400 seconds to process all 8,000! I did not go for 5 receivers per consumer group because of the ordering guarantees needed. Does the math check out? – teeboy Dec 09 '20 at 01:29
  • The above calculation is for the Standard tier – teeboy Dec 09 '20 at 12:55
  • I'm going to be a bit pedantic on terminology. The ordering guarantee exists for a single reader of a partition. Every type that reads starts at a given position in the event stream and reads contiguously forward. However, each reads the same stream with no knowledge of other readers. So, if one application is processing events from both readers, it will see duplication and out-of-order events between those two. – Jesse Squire Dec 09 '20 at 14:15
  • EventProcessorClient and EventProcessor mitigate this by coordinating to ensure there is only a single reader for a partition - for one consumer group. They will not coordinate across consumer groups, so if you have processors working against the same Event Hub but different consumer groups, you will have more than one active reader per partition and see duplication and out-of-order reads. To scale the way that you're describing, the load needs to be spread across more partitions - either in the same Event Hub or another. – Jesse Squire Dec 09 '20 at 14:21
  • Thanks, @JesseSquire. Obviously, I have a choice to use Azure service bus also for "ordering" with "SessionIds". However, for the number of messages we need to process, Event Hubs is the way to go. If I have to group all "related" messages to a specific partition so as to process them in order (mimicking a "sessionId" from service bus), am I going to lose HA since that specific partition can be down and I have to write fault handling logic to handle this scenario myself? – teeboy Dec 09 '20 at 14:37
  • Your statement is correct; it's uncommon, but it is possible for a partition node to experience an issue that renders it temporarily unavailable. The Event Processor clients are resilient to these failures and would recover without the application having to manage failure cases. You would have to manually handle failures with the other client types. As an aside, since you mentioned Service Bus, applications with strict ordering needs such as you're describing are most often more appropriate for Service Bus, since the throughput advantage of Event Hubs is reduced by the sequential processing. – Jesse Squire Dec 09 '20 at 20:48
  • But we can still get throughput advantages by going to the Dedicated tier on Event Hubs, as the limits/quotas are very high and would satisfy most scale requirements. Our requirement is more on IoT Hub for edge devices. – teeboy Dec 10 '20 at 21:19
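The back-of-the-envelope math from the comment thread can be checked in a few lines. This is a sketch under the stated Standard-tier assumptions (20 TUs at 2 MB/s egress each, 5 KB messages, 1 second of processing per message, one ordered reader per consumer group across 20 consumer groups):

```python
# Verifying the throughput arithmetic from the comments above.
egress_bytes_per_sec = 20 * 2 * 1024 * 1024        # 20 TUs x 2 MB/s egress each
message_bytes = 5 * 1024                           # 5 KB per message
messages_per_sec = egress_bytes_per_sec // message_bytes
print(messages_per_sec)                            # 8192 -- roughly the 8,000/s cited

backlog = 8_000                                    # one second's worth of messages
parallel_readers = 20                              # 1 ordered reader x 20 consumer groups
processing_secs = backlog / parallel_readers       # at 1 s of processing per message
print(processing_secs)                             # 400.0 -- matches the 400 s in the thread
```

So the math does check out: egress capacity delivers events far faster than 20 sequential readers can process them, which is exactly the gap the sequential-ordering requirement creates.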