3

I have an application where I map devices from the physical world to Reliable Actors in Azure Service Fabric. Each time I receive a message from a device, I want to push a message to an event hub.

What I'm doing right now is creating/using/closing the EventHubClient object for each message.

This is very inefficient (it takes about 1500 ms), but it solves an issue I had in the past, when I was keeping the EventHubClient in memory: with a lot of devices, the underlying virtual machine can quickly run out of network connections.

I'm thinking about creating a new actor that would be responsible for pushing data to the EventHub (by keeping the EventHubClient alive). Because of the turn-based concurrency model of Reliable Actors, I'm not sure it's a good idea. If I get 10,000 devices pushing data "at the same time", each of their actors will block while pushing its message to the new actor that pushes messages to the EventHub.

What is the recommended approach for this scenario? Thanks,

japf

3 Answers

4

You could use a pub/sub pattern here (use the BrokerService). By decoupling event publishing from event processing, you don't need to worry about the turn-based concurrency model.

Publishers

The Actor sends out messages by simply publishing them to a BrokerService.

Subscribers

Then you use one or more Stateless Services or (different) Actors as subscribers of the events. They would send them into EventHub at their own pace.

Event Hub Client

Using this approach you'd have full control over the EventHubClient instance counts and lifetimes. You could increase event processing power by simply adding more subscribers.
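To make the decoupling concrete, here is a minimal, language-agnostic sketch in Python. It is not the BrokerService API: `FakeEventHubClient`, `run_subscriber` and `publish_all` are hypothetical names, and an in-process `queue.Queue` stands in for the broker. It only shows that publishers return as soon as the message is queued, while a fixed number of subscribers (each owning one client) drain the queue.

```python
import queue
import threading


class FakeEventHubClient:
    """Hypothetical stand-in for a real Event Hub client; it only records sends."""

    def __init__(self):
        self.sent = []

    def send(self, message):
        self.sent.append(message)


def run_subscriber(broker, client):
    """Drain the broker at the subscriber's own pace until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        client.send(message)


def publish_all(device_count, subscriber_count):
    """Publish one message per device and return the subscribers' clients."""
    broker = queue.Queue()
    # One long-lived client per subscriber, not one per device.
    clients = [FakeEventHubClient() for _ in range(subscriber_count)]
    workers = [
        threading.Thread(target=run_subscriber, args=(broker, client))
        for client in clients
    ]
    for worker in workers:
        worker.start()
    for device_id in range(device_count):
        # The publisher does not wait for the send to the event hub.
        broker.put(f"device-{device_id}")
    for _ in workers:
        broker.put(None)  # one stop sentinel per subscriber
    for worker in workers:
        worker.join()
    return clients
```

With, say, `publish_all(10_000, 4)`, only four clients exist regardless of the number of devices, which is the property that matters for the connection limit described in the question.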

LoekD
4

One approach would be to create a stateless service that is responsible for pushing messages to the EventHub. Each time an Actor receives a message from the device (by the way, how are the devices communicating with the actors?), the Actor calls the stateless service. The stateless service in turn would be responsible for creating, maintaining and disposing of one EventHubClient per service instance. A Reliable Service would not introduce the same 'overhead' when it comes to handling incoming messages as a Reliable Actor would. If it is important for your application that the messages reach the EventHub in strictly the same order in which they were produced, then you would have to do this with a Stateful Service and a Reliable Queue. (Note, however, that there is no guarantee that Actors will finish handling incoming messages in the same order in which they were produced.)

You could then fine-tune the solution by experimenting with the instance count (https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-availability-services) to make sure you have enough instances to handle the throughput of incoming messages. How many instances you need is roughly determined by the number of nodes and cores per node, although other factors may also affect it.

[Figure: Simplified architecture: Actors communicating with Services, Services with EventHub]

Devices communicate with your Actors, the Actors in turn communicate with the Service (which may be Stateless, or Stateful if you want to queue messages, see below), and each Service manages an EventHubClient that can push messages to the EventHub.

If your cluster is unable to support an instance count for this service that is high enough (a little simplified: more instances = higher throughput), then you may need to create it as a Stateful Service instead, put messages in a Reliable Queue in the Service, and have the RunAsync of the Service process the queue in order. This could take the pressure off during peaks in load.
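The queue-then-drain idea can be sketched as follows. This is an illustration in Python under loud assumptions: `QueueingService` is a hypothetical class, and a plain in-memory `deque` is used where a real Stateful Service would use a Reliable Queue, so none of the replication or transactional guarantees are modelled here. It only shows that enqueueing is cheap and that a single processing loop forwards messages in arrival order.

```python
from collections import deque


class QueueingService:
    """Hypothetical sketch of a service that buffers messages and forwards them in order."""

    def __init__(self, send):
        # 'send' is any callable that pushes one message to the event hub (assumed).
        self._pending = deque()
        self._send = send

    def enqueue(self, message):
        """Called by actors: only appends, so it returns quickly even during a peak."""
        self._pending.append(message)

    def run_once(self, max_items):
        """One pass of the processing loop: forward up to max_items in FIFO order."""
        forwarded = 0
        while self._pending and forwarded < max_items:
            self._send(self._pending.popleft())
            forwarded += 1
        return forwarded
```

A burst of messages simply makes the queue longer; the loop catches up over the following passes instead of forcing the callers to wait.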

The WordCount sample in the Service Fabric Azure-Samples shows how to work with different Partitions so that the messages from Actors target different instances (or, really, partitions).
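As a rough illustration of routing by partition (not taken from the WordCount sample), the sketch below maps a device identifier to a partition index. The function name `partition_for` and the choice of SHA-256 are assumptions made for this example; the only point is that the mapping must be stable, so that messages from the same device always land on the same partition.

```python
import hashlib


def partition_for(device_id: str, partition_count: int) -> int:
    """Map a device id to a partition index in [0, partition_count).

    A cryptographic digest is used instead of Python's built-in hash(),
    because hash() of a str is randomized per process and would not be stable.
    """
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count
```

Because all messages from one device go to the same partition, per-device ordering can be kept within that partition while different devices are spread across instances.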

A general tip would be to not try to use Actors for everything (for the right things they are great and reduce complexity a lot). The Reliable Services model supports a lot more scenarios and requirements and could really complement your Actors (rather than trying to make Actors do something they are not really designed for).

yoape
0

In my opinion, you should call the event hub directly from your actors, on a background thread fed by an internal in-memory queue. You should aggregate messages and use SendBatch to improve performance.

The event hub is able to handle the load by itself.
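The aggregation step could look roughly like this. It is a sketch in Python with a hypothetical helper name, `batches`, and it groups by message count only; real Event Hub batches are also limited by a maximum size in bytes, which is ignored here. Each yielded list would then be handed to a single batch-send call.

```python
def batches(messages, max_batch_size):
    """Group an iterable of messages into lists of at most max_batch_size items."""
    batch = []
    for message in messages:
        batch.append(message)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        yield batch
```

Messages that are still sitting in the in-memory buffer are lost if the process goes down before they are sent, so this trades durability for throughput.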

  • An internal memory queue in the Actor is not reliable in the case where the Actor's service goes down (for whatever reason). In that scenario the batch would be lost. Or do you mean to use Actor's persisted state to store and batch events? – yoape Jan 10 '17 at 15:19
  • An actor should be single-threaded; background threads do not make any sense, from my point of view. – Holger Leichsenring Apr 12 '17 at 13:27