
I have several DB queries that need to be run. Some of them need the results of a previous query before they can run, which forms a hierarchy of sorts.

When one query finishes, I want to process its data without waiting for its child queries to finish. A branch of the hierarchy may also stop early if a query returns no data.

I don't need the end result as a hierarchy, so flattened results are the goal (similar to the List<IEnumerable<object>> in the example).

I have changed some things for simplicity, but the general gist of what I currently have is:

private static async Task ProcessQueries(Foo initialFoo)
{
    var allEntities = new List<IEnumerable<object>>();
    var runningProcesses = new List<Task<Bar>>();
    var queriesToProcess = new Queue<Foo>();
    queriesToProcess.Enqueue(initialFoo);

    while (runningProcesses.Count > 0 || queriesToProcess.Count > 0)
    {
        while (queriesToProcess.Count > 0)
        {
            var fooToProcess = queriesToProcess.Dequeue();
            runningProcesses.Add(ProcessFoo(fooToProcess));
        }

        // As soon as any of the 'ProcessFoo' have finished, it means the query for the entities has finished
        // And it might have more Foo's for us to process.
        // I want to call ProcessFoo as soon as possible as that actually starts the query.
        var finishedProcess = await Task.WhenAny(runningProcesses).ConfigureAwait(false);
        runningProcesses.Remove(finishedProcess);

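        // The task has already completed, so this await does not block.
        // Unlike .Result, it rethrows the original exception instead of an AggregateException.
        var finishedFoo = await finishedProcess.ConfigureAwait(false);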
        foreach (var childFoo in finishedFoo.ChildFoos)
        {
            queriesToProcess.Enqueue(childFoo);
        }

        // I can now pass the entities to something else to process it without stopping the loop
        // In this example I just add it to the list
        allEntities.Add(finishedFoo.Entities);
    }
}

Foo just contains all the information needed to start off the query.

Bar contains the entities, plus a list of additional Foo's to run.


I read about potential issues with Task.WhenAny if the list of tasks grows too large. I don't think the solution proposed there applies to me, because my list of tasks is not a fixed size.
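As I understand it, the concern is that each call to Task.WhenAny has to look at every task in the list, so calling it in a loop costs more and more as the list grows. One way to sidestep that with a dynamically growing set of tasks might be to have each task post itself to a channel when it completes, and read from the channel instead. The following is only a sketch under assumptions: the Foo and Bar shapes, the Depth property, and the simulated ProcessFoo are hypothetical stand-ins, not my real code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-ins for the real Foo/Bar types.
public sealed record Foo(string Name, int Depth);
public sealed record Bar(IEnumerable<object> Entities, IReadOnlyList<Foo> ChildFoos);

public static class ChannelDrain
{
    // Simulated query (assumption): every Foo above depth 2 yields two children.
    public static async Task<Bar> ProcessFoo(Foo foo)
    {
        await Task.Delay(10).ConfigureAwait(false);
        Foo[] children = foo.Depth < 2
            ? new[] { new Foo(foo.Name + ".a", foo.Depth + 1), new Foo(foo.Name + ".b", foo.Depth + 1) }
            : Array.Empty<Foo>();
        return new Bar(new object[] { foo.Name }, children);
    }

    public static async Task<List<IEnumerable<object>>> ProcessQueries(Foo initialFoo)
    {
        var allEntities = new List<IEnumerable<object>>();
        var completed = Channel.CreateUnbounded<Task<Bar>>();
        var pending = 0;

        // Starts the query right away and registers exactly one continuation per task.
        void Start(Foo foo)
        {
            pending++;
            var task = ProcessFoo(foo);
            _ = task.ContinueWith(t => completed.Writer.TryWrite(t), TaskScheduler.Default);
        }

        Start(initialFoo);
        while (pending > 0)
        {
            // Tasks arrive here in completion order.
            var finished = await completed.Reader.ReadAsync().ConfigureAwait(false);
            pending--;

            // Already completed; awaiting rethrows any query failure.
            var bar = await finished.ConfigureAwait(false);
            foreach (var child in bar.ChildFoos)
            {
                Start(child);
            }
            allEntities.Add(bar.Entities);
        }
        return allEntities;
    }
}
```

The pending counter is only touched from the reading loop, so it needs no locking. System.Threading.Channels ships with .NET Core 3.0 and later and is available as a NuGet package for older targets.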

I'm wondering if there is a more efficient way to accomplish this, or if there are any existing designs for this kind of work.

Lolop
  • Your question is too broad and opinion based to be suitable here. That said, you should be able to compose your queries as a hierarchy of calls to async methods, so that the code naturally represents the hierarchy you want to achieve. – Peter Duniho Sep 27 '20 at 02:59
  • @PeterDuniho Where would it be suitable? If I changed my question to 'Is there a more efficient way to accomplish this?' or 'Are there existing designs for this kind of situation' does that make it not opinion based anymore? That's mostly what I'm looking for when I asked 'Is this a good way to accomplish this'. – Lolop Sep 27 '20 at 03:23
  • A great library for solving this kind of problem efficiently is the [TPL Dataflow](https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/dataflow-task-parallel-library). It consists of composable "blocks" that apply a transformation to their input flow and produce an output flow. There are many types of built-in blocks, but none that can solve this recursion/hierarchy problem out of the box. You can find some custom solutions here: [How to mark a TPL dataflow cycle to complete?](https://stackoverflow.com/questions/26130168/how-to-mark-a-tpl-dataflow-cycle-to-complete) – Theodor Zoulias Sep 27 '20 at 05:46
  • Questions like "is there a more efficient way" or "are there existing designs" are IMHO too broad. It's important that when one writes an answer to a question, one can have confidence that the answer that is written is _the_ answer to the question. Open-ended design questions _may_ be more suitable on https://softwareengineering.stackexchange.com. – Peter Duniho Sep 27 '20 at 06:52
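
The suggestion in the comments to compose the queries as a hierarchy of calls to async methods could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the Foo and Bar shapes, the Depth property, and the simulated ProcessFoo are hypothetical stand-ins for the question's real types, and a ConcurrentBag is assumed as the flat result collection because several branches may add to it concurrently.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-ins for the question's Foo/Bar types.
public sealed record Foo(string Name, int Depth);
public sealed record Bar(IEnumerable<object> Entities, IReadOnlyList<Foo> ChildFoos);

public static class HierarchyWalker
{
    // Simulated query (assumption): every Foo above depth 2 yields two children.
    public static async Task<Bar> ProcessFoo(Foo foo)
    {
        await Task.Delay(10).ConfigureAwait(false);
        Foo[] children = foo.Depth < 2
            ? new[] { new Foo(foo.Name + ".a", foo.Depth + 1), new Foo(foo.Name + ".b", foo.Depth + 1) }
            : Array.Empty<Foo>();
        return new Bar(new object[] { foo.Name }, children);
    }

    // Runs one query, hands off its entities immediately, then starts all
    // child queries at once. Sibling branches proceed independently.
    public static async Task ProcessTree(Foo foo, ConcurrentBag<IEnumerable<object>> results)
    {
        var bar = await ProcessFoo(foo).ConfigureAwait(false);

        // The entities are available here without waiting for any child query.
        results.Add(bar.Entities);

        // A branch with no children ends here: WhenAll over an empty sequence completes at once.
        await Task.WhenAll(bar.ChildFoos.Select(child => ProcessTree(child, results)))
            .ConfigureAwait(false);
    }
}
```

With this shape there is no shared list of running tasks to poll, so Task.WhenAny is not needed at all; each node only awaits its own direct children.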

0 Answers