
I have the following scenario:

  1. A variable number (greater than three) of queues; the exact number depends on a setting in a configuration file.
  2. Some of these queues may or may not be fed with data. This depends on the producer, which receives data through a network client, and the client can connect and disconnect during the same session.
  3. The queues are fed at different speeds; for example, Queue1 can hold 10 objects at a given time while Queue2 holds just 3 objects at that same time.
  4. The objects in these queues must be synchronized according to a property shared by all of them (a constantly increasing int property named "SSId").
  5. The synchronization must involve only the queues that are being fed with data at a given moment (unconnected queues have to be excluded).
  6. Once the objects are synchronized, they must be pushed to the corresponding output queue used by the related consumer: each producer is associated with a specific consumer.
  7. After the previous step, all consumers can process the enqueued objects with the same "SSId" value at the same time.
  8. The final outcome should be a system where the consumers process data (synchronized according to the "SSId" property) at the same rate, even when the producers generate it at different rates.

To give a clearer idea, here is a diagram of the flow described in the previous points: [image: dataflow mesh]

Note that the new items with an SSId greater than 100 are not pushed to the consumer queues, as there are no corresponding items in the other queues yet.

Could you suggest an approach for creating this kind of synchronization using either .NET TPL Dataflow or Rx.NET? Until now I've used TPL Dataflow only for implementing simple sequential pipelines, and I'd like some feedback on how to proceed with this scenario. Thanks in advance for any suggestions.

Alex
  • What triggers the synchronize operation? How do you detect a not-connected producer? – Shlomo Aug 05 '19 at 14:43
  • The synchronize operation should run from the moment the producers start filling the queues: whenever every connected queue holds an item with the same SSId, those items should be sent to the consumer queues. A non-connected producer is detected through an event raised every time the producer gets disconnected (or connected again). – Alex Aug 05 '19 at 14:50
  • Is it a given that the IDs are always increasing? – Theodor Zoulias Aug 13 '19 at 23:24
  • Yes, in our case IDs are defined as long and constantly increase by one: 1, 2, 3, ... – Alex Oct 29 '19 at 07:24

1 Answer

How about

  1. Merging objects from all producers into one observable
  2. Grouping the objects by SSId
  3. Emitting each group once its size equals the number of connected producers (using .Buffer())

Like this:

var syncedProducers =
    // ConnectedProducersEvent emits an array of the currently connected producers each time a producer connects or disconnects
    ConnectedProducersEvent
        .Select(producers =>
            Observable
                .Merge(producers) // Put all objects from all producers into the same observable
                .GroupBy(@object => @object.SSId) // Group objects by matching SSId
                .SelectMany(group => group.Buffer(producers.Length))) // Syncing: emit the SSId group when its count matches the count of connected producers
        // Switch (rather than an outer SelectMany) unsubscribes from the previous producer set,
        // so items are not emitted twice after a connect/disconnect
        .Switch();

// Now you can wire syncedProducers to consumers
var consumer1 = 
    syncedProducers
        .Select(x => x.Where(y => y.Producer == 1));

You can run the example on dotnetfiddle
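
To see the grouping rule on its own, here is a minimal plain-C# sketch of the same idea without Rx. It is only an illustration under a few assumptions: `Item`, `SsIdSynchronizer` and `TryPush` are hypothetical names (not part of Rx.NET or TPL Dataflow), the number of connected producers is fixed for the lifetime of the object, every connected producer eventually delivers each SSId, and all calls come from a single thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type: the question only specifies the shared "SSId" property.
public sealed record Item(int Producer, long SSId);

// Hypothetical helper that collects items per SSId and releases a group
// once every connected producer has delivered an item for that SSId.
public sealed class SsIdSynchronizer
{
    private readonly Dictionary<long, List<Item>> _pending = new Dictionary<long, List<Item>>();
    private readonly int _connectedCount;

    public SsIdSynchronizer(int connectedCount)
    {
        _connectedCount = connectedCount;
    }

    // Returns true and the complete group when this item was the last one
    // missing for its SSId; otherwise the item stays pending.
    public bool TryPush(Item item, out IReadOnlyList<Item> completed)
    {
        if (!_pending.TryGetValue(item.SSId, out var group))
        {
            group = new List<Item>();
            _pending[item.SSId] = group;
        }
        group.Add(item);

        if (group.Count < _connectedCount)
        {
            completed = Array.Empty<Item>();
            return false;
        }

        _pending.Remove(item.SSId); // the group is complete, stop tracking it
        completed = group;
        return true;
    }
}
```

With `new SsIdSynchronizer(3)`, the first two pushes for SSId 100 return false and the third returns true together with the three items. An item with SSId 101 stays pending until its counterparts arrive, which matches the note in the question about items that are not yet pushed to the consumer queues.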

Magnus