0

I'm having difficulty conceptualising a requirement I have into something that will fit into our nascent SOA/EDA

We have a component I'll call the Data Downloader. This is a facade for an external data provider that has both high latency and a cost associated with every request. I want to take this component and create a re-usable service out of it with a clear contract definition. It is up to me to decide how that contract should work, however its responsibilities are two-fold:

  1. Maintain the parameter list (called a Download Definition) for an upcoming scheduled download
  2. Manage the technical details of the communication to the external service

Basically, it manages the 'how' of the communication. The 'what' and the 'when' are the responsibilities of two other components:

  • The 'what' is managed by 'Clients' who are responsible for determining the parameters for the download.

  • The 'when' is managed by a dedicated scheduling component. Because of the cost associated with the downloads we'd like to batch the requests intraday.

Hopefully this sequence diagram explains the responsibilities of the services:

Because each of the responsibilities are split out in three different components, we get all sorts of potential race conditions with async messaging. For instance when the Scheduler tells the Downloader to do its work, because the 'Append to Download Definition' command is asynchronous, there is no guarantee that the pending requests from Client A have actually been serviced. But this all screams high-coupling to me; why should the Scheduler necessarily know about any 'prerequisite' client requests that need to have been actioned before it can invoke a download?

Some potential solutions we've toyed with:

  • Make the 'Append to Download Definition' command a blocking request/response operation. But this then breaks the perf. and scalability benefits of having an EDA

  • Build something in the Downloader to ensure that it only runs when there are no pending commands in its incoming request queue. But that then introduces a dependency on the underlying messaging infrastructure which I don't like either.

Makes me think I'm thinking about this problem in a completely backward way. Or is this just a classic case of someone trying to fit a synchronous RPC requirement into an async event-driven architecture?

Community
  • 1
  • 1
Nick Head
  • 26
  • 4

1 Answers1

0

The thing I like most about EDA and SOA, is that it almost completely eliminates the notion of race condition. As long as your events are associated with some association key (e.g. downloadId), the problem you describe can be addressed with several solutions of different complexities - depending on your needs. I'm not sure I totally understand the described use-case but I will try my best

Out of the top of my head:

  1. DataDownloader maintains a list of received Download Definitions and a list of triggered downloads. When a definition is received it is checked against the triggers list to see if the associated download has already been triggered, and if it was, execute the download. When a TriggerDownloadCommand is recieved, the definitions list is checked against a definition with the associated downloadId.
    1. For more complex situation, consider using the Saga pattern, which is implemented by some 3rd party messaging infrastructures. With some simple configuration, it will handle both messages, and initiate the actual download when the required condition is satisfied. This is more appropriate for distributed systems, where an in-memory collection is out of the question.
    2. You can also configure your scheduler (or the trigger command handler) to retry when an error is signaled (e.g. by an exception), in order to avoid that race condition, and ultimately give up after a specified timeout.

Does this help?

moranlf
  • 554
  • 5
  • 19