
I have a BufferBlock set up like this:

_inputQueue = new BufferBlock<WorkItem>(new DataflowBlockOptions
{
    BoundedCapacity = 1,
    CancellationToken = cancellationToken,
    EnsureOrdered = true
});

I have multiple consumers calling the GetWork function below from separate threads:

public async Task<WorkItem> GetWork()
{
    WorkItem wi;
    try
    {
        wi = await _inputQueue.ReceiveAsync(TimeSpan.FromSeconds(1));
    }
    catch (TimeoutException)
    {
        // ReceiveAsync throws TimeoutException when no item arrives within the timeout
        return null;
    }
    return wi;
}

Occasionally, the same WorkItem ends up in multiple consumers! The more work items there are in _inputQueue, the higher the chance of duplicates being received in GetWork. My understanding is that ReceiveAsync dequeues atomically: once an item is read, it will not be read again. That's not what is happening here. I have around 40 parallel consumers calling GetWork.

  • Why not use an ActionBlock or TransformBlock with a DOP of 40? (A sketch of that follows this thread.) In any case, what you posted can't demonstrate the problem. In fact, with `BoundedCapacity = 1` the 40 workers will keep timing out unless something on the other side pumps more than 40 messages per second – Panagiotis Kanavos Mar 15 '19 at 13:01
  • How did you determine there are duplicates? How are items *posted* to the BufferBlock? Perhaps the duplicates are just different messages with the same data? Can you post a small sample that actually demonstrates the issue? – Panagiotis Kanavos Mar 15 '19 at 13:02
  • I don't see any timeout issues. They are processing just fine. I just processed 1000 work items in InputQueue and all of them were fetched. However, some were fetched twice. What advantage does ActionBlock have over BufferBlock to prevent duplicate reads? – teeboy Mar 15 '19 at 13:05
  • One of the properties of the WorkItem object is Id, which is guaranteed unique. I am logging all the Ids processed to a log provider along with the thread ID. I see the same Id repeated across multiple threads in the log. – teeboy Mar 15 '19 at 13:06
  • You haven't demonstrated any problem yet. You haven't shown any duplicates. The code is highly unusual though - why a 1 second timeout? Why 40 readers instead of an ActionBlock with 40 tasks? Why return *null* if a block times out? – Panagiotis Kanavos Mar 15 '19 at 13:06
  • Dataflow blocks are used to build processing pipelines, not as awaitable ConcurrentQueue implementations. A timeout error means something went seriously wrong and the pipeline has to be torn down. – Panagiotis Kanavos Mar 15 '19 at 13:08
  • BTW I use dataflow pipelines to download and process thousands of air tickets every 15 minutes. If there were duplicates I'd have noticed. – Panagiotis Kanavos Mar 15 '19 at 13:09
  • This is part of a Service Fabric application. My consumers are part of a stateful service distributed across multiple nodes. They call the "GetWork" function of the "Producer" service over a TCP remoting channel. The producer service is a single-instance service in Service Fabric. I have 40 threads in total, configured across all the "Consumer" service instances, calling GetWork(). – teeboy Mar 15 '19 at 13:09
  • You just added 100 times extra complexity unrelated to BufferBlock. BufferBlock doesn't deliver those messages to the workers, Service Fabric does. If Service Fabric delivers the same message to multiple workers, or the same message arrives multiple times, BufferBlock can't do anything about it. In fact, why use a *bufferblock* when each worker runs on its own thread? Even a Queue would do. – Panagiotis Kanavos Mar 15 '19 at 13:14
  • I need a queue where I can atomically dequeue and serve work items to be processed across all the "Consumers". These consumers happen to live in stateful services in Service Fabric, so I have InputQueue servicing the 40 or so parallel calls coming into GetWork – teeboy Mar 15 '19 at 13:18
  • In distributed systems multiple delivery is always a possibility, unless extra mechanisms are added to handle duplicates. Perhaps an acknowledgement was missed and the infrastructure resent the message. Cloud services don't have transactional reception either and work with leases. Perhaps you forgot to acknowledge receipt of the message, causing it to reappear after its lease ran out. – Panagiotis Kanavos Mar 15 '19 at 13:19
  • If you can't write a *simple* program that demonstrates the issue, look to the rest of the code. In any case, you haven't posted the relevant code. Service Fabric doesn't use BufferBlock; somehow, somewhere, your own code posts stuff to it. All cloud-based message-passing services use leasing instead of transactional reads, which means that if you forget to acknowledge receipt of a message, it will reappear in the input queue. – Panagiotis Kanavos Mar 15 '19 at 13:21
  • Service Fabric communication is not message passing. It's a simple RPC-style call via a "RemoteProxy" object [a.k.a. a WCF remoting object]. I will add some logging immediately after the _inputQueue.ReceiveAsync call to see if it's giving duplicates. – teeboy Mar 15 '19 at 13:24
  • The things called WCF Remoting in the Azure SDK aren't what Remoting refers to outside Azure. Even so, the operations *are* asynchronous. With a `BoundedCapacity` of one, though, calling `SendAsync` on the BufferBlock would be equivalent to just processing the message in the operation itself. – Panagiotis Kanavos Mar 15 '19 at 13:53
  • In any case, a question about *BufferBlock* needs nothing more than a console application with one task/thread writing to it and workers reading from it. Can you replicate the issue this way? (A minimal repro sketch follows this thread.) – Panagiotis Kanavos Mar 15 '19 at 13:54
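
Following up on that last comment, a minimal console repro might look like the sketch below. This is not the original poster's code; the item type, the counts, and the `seen` dictionary are illustrative. Run as-is, it should report no duplicates, consistent with ReceiveAsync removing each item atomically:

using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static async Task Main()
    {
        var queue = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = 1 });
        var seen = new ConcurrentDictionary<int, int>();

        // 40 readers mirroring the question's GetWork loop.
        var readers = Enumerable.Range(0, 40).Select(_ => Task.Run(async () =>
        {
            while (true)
            {
                int item;
                try
                {
                    item = await queue.ReceiveAsync(TimeSpan.FromSeconds(1));
                }
                catch (TimeoutException)
                {
                    continue; // nothing arrived within a second; try again
                }
                catch (InvalidOperationException)
                {
                    return;   // the block completed and drained
                }
                seen.AddOrUpdate(item, 1, (_, n) => n + 1); // count deliveries per item
            }
        })).ToArray();

        // Single producer posting 1000 unique items.
        for (var i = 0; i < 1000; i++)
            await queue.SendAsync(i);
        queue.Complete();

        await Task.WhenAll(readers);

        var duplicates = seen.Count(kv => kv.Value > 1);
        Console.WriteLine(duplicates == 0
            ? "Each item was received exactly once."
            : $"{duplicates} items were received more than once.");
    }
}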
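
And here is a sketch of the ActionBlock alternative suggested in the comments. It assumes the consumers' work can be expressed as a delegate; ProcessAsync is a placeholder for whatever a consumer actually does with a WorkItem, not a method from the original post:

// Hypothetical replacement for the 40 hand-rolled GetWork readers.
var worker = new ActionBlock<WorkItem>(
    async wi => await ProcessAsync(wi),   // placeholder for the consumer's actual work
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 40,      // up to 40 items processed concurrently
        BoundedCapacity = 40,             // throttles producers instead of timing out readers
        CancellationToken = cancellationToken
    });

// Producer side: SendAsync waits while the block is at capacity.
await worker.SendAsync(workItem);

// Shutdown: signal completion and wait for in-flight items to finish.
worker.Complete();
await worker.Completion;

The block dispatches items to its workers internally, so there is no receive loop, no timeout, and no race between readers to get wrong.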

1 Answer


This seems to be a Service Fabric issue. The BufferBlock is dequeuing each item only once. The producers [Service Fabric stateful service instances with a partition count of 5] are receiving the same item twice in different partitions. I'll have to investigate this.
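
If the duplicates really do come from Service Fabric delivering the same item more than once, one hedge on the consuming side is to make GetWork idempotent. A minimal sketch, assuming WorkItem.Id is a Guid and unique as stated in the comments; the _seenIds set is hypothetical and not part of the original code:

// Requires System.Collections.Concurrent.
// Hypothetical guard: remembers every Id already handed out.
private readonly ConcurrentDictionary<Guid, byte> _seenIds = new ConcurrentDictionary<Guid, byte>();

public async Task<WorkItem> GetWork()
{
    while (true)
    {
        WorkItem wi;
        try
        {
            wi = await _inputQueue.ReceiveAsync(TimeSpan.FromSeconds(1));
        }
        catch (TimeoutException)
        {
            return null; // nothing arrived within the timeout
        }

        // TryAdd succeeds only for the first delivery of a given Id,
        // so a re-delivered item is skipped instead of handed out twice.
        if (_seenIds.TryAdd(wi.Id, 0))
            return wi;
    }
}

In a long-running service the set would need trimming or expiry, and this only masks re-delivery; the real fix is still on the Service Fabric side.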
