What do I specify as the Dop parameter for ForEachAsync extension method?

Question

I recently discovered the code below (from the following blog post) for efficiently running lots of I/O-bound tasks:
Implementing a simple ForEachAsync, part 2

I'm under the impression the following are true:

This is much better than using Parallel.ForEach because the work is not CPU bound.
ForEachAsync will help in queueing as many IO tasks as possible (without necessarily putting these on separate threads).
The TPL will 'know' these are IO based tasks and not spin up more threads, instead using callbacks/task completion source to signal back to the main thread, thus saving overhead of thread context switching.

My question is: since Parallel.ForEach intrinsically has its own MaxDegreeOfParallelism defined, how do I know what value to give the dop parameter in the example code of the IEnumerable extension below?

E.g., if I have 1000 items to process and need to carry out an I/O-based SQL Server DB call for each item, would I specify 1000 as the dop? With Parallel.ForEach it is used as a limiter to prevent too many threads spinning up, which might hurt performance. But here it seems to be used to partition the work into a minimum number of async tasks. I'm thinking there should be no maximum as such (with the minimum being the total number of items to process), because I want to queue as many I/O-based calls to the database as possible.

How do I go about knowing what to set the DOP parameter to?

public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body) 
{ 
    return Task.WhenAll( 
        from partition in Partitioner.Create(source).GetPartitions(dop) 
        select Task.Run(async delegate { 
            using (partition) 
                while (partition.MoveNext()) 
                    await body(partition.Current); 
        })); 
}

usr · Accepted Answer · 2015-10-25T22:15:43.150

Parallel.ForEach intrinsically has its own MaxDegreeOfParallelism

OK, the heuristics built into Parallel.ForEach are very prone to spawn huge numbers of tasks over time (if your work items have a 10ms delay you get hundreds of tasks after an hour or so - I measured it). Really terrible design flaw, don't try to emulate this.

When running IO in parallel there is no substitute for empirically determining the right value. That's why the TPL is so bad at it. For example, a magnetic disk doing sequential IO likes a DOP of 1. An SSD doing random IO likes a basically infinite DOP (100?).

A remote web-service gives you no way of knowing the right DOP. Not only do you need to test, you need to ask the owner for permission to spam the service with requests which might overload it.
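
One way to do that empirical test is to run the same workload at several candidate DOP values and compare the elapsed times. The following is only a minimal sketch: DopSweep, SimulatedIoAsync and MeasureAsync are hypothetical names, and the Task.Delay call is a stand-in for the real database or web-service call. ForEachAsync is repeated from the question so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DopSweep
{
    // ForEachAsync as posted in the question, repeated here for self-containment.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }

    // Hypothetical stand-in for one I/O-bound call (assumption: ~10 ms latency).
    public static Task SimulatedIoAsync(int item) => Task.Delay(10);

    // Runs the same workload once per candidate DOP and records the elapsed time.
    public static async Task<Dictionary<int, TimeSpan>> MeasureAsync(
        int itemCount, IEnumerable<int> candidateDops, Func<int, Task> body)
    {
        var timings = new Dictionary<int, TimeSpan>();
        foreach (var dop in candidateDops)
        {
            var stopwatch = Stopwatch.StartNew();
            await Enumerable.Range(0, itemCount).ForEachAsync(dop, body);
            timings[dop] = stopwatch.Elapsed;
        }
        return timings;
    }
}
```

A call such as `await DopSweep.MeasureAsync(200, new[] { 1, 4, 16, 64 }, DopSweep.SimulatedIoAsync)` returns one timing per DOP. Against a real backend the numbers only mean something if the test runs against the real target under realistic load, which is also why permission from the service owner matters.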

would I specify 1000 as the dop?

Then you would not need this facility at all. Just spawn all tasks, then wait for all of them. But 1000 is likely the wrong DOP because it overwhelms the DB for no benefit.
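
To make the difference concrete, here is a rough sketch that counts how many calls are in flight at once, first when every task is spawned up front and then when the work goes through ForEachAsync with a small dop. ThrottleDemo, InFlightCounter and the two Peak... methods are hypothetical helpers written for this illustration; ForEachAsync is copied from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // ForEachAsync as posted in the question, repeated here for self-containment.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }

    // Hypothetical helper: records the highest number of concurrently running bodies.
    public sealed class InFlightCounter
    {
        private int _current;
        private int _peak;
        public int Peak => Volatile.Read(ref _peak);

        public async Task RunAsync(Func<Task> work)
        {
            int now = Interlocked.Increment(ref _current);
            int seen;
            // Raise the recorded peak if this call pushed concurrency higher.
            while (now > (seen = Volatile.Read(ref _peak)))
                Interlocked.CompareExchange(ref _peak, now, seen);
            try { await work(); }
            finally { Interlocked.Decrement(ref _current); }
        }
    }

    // "Just spawn all tasks, then wait for all of them."
    public static async Task<int> PeakUnthrottledAsync(int itemCount, Func<int, Task> body)
    {
        var counter = new InFlightCounter();
        await Task.WhenAll(Enumerable.Range(0, itemCount)
            .Select(i => counter.RunAsync(() => body(i))));
        return counter.Peak;
    }

    // Same workload, but at most `dop` bodies are awaited at any one time.
    public static async Task<int> PeakThrottledAsync(int itemCount, int dop, Func<int, Task> body)
    {
        var counter = new InFlightCounter();
        await Enumerable.Range(0, itemCount)
            .ForEachAsync(dop, i => counter.RunAsync(() => body(i)));
        return counter.Peak;
    }
}
```

With a body that simply awaits a short delay, the throttled variant should never report a peak above dop, while the unthrottled variant will typically have nearly every item in flight at once, which is the situation that can overwhelm the DB.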

here it seems to be used to partition up the minimum number of async tasks

Another terrible feature of Parallel.For. On machines with few CPU cores it might spawn too few tasks. Horrible API. Do not use it with IO. (I use AsParallel, which allows you to set an exact DOP, not a max DOP.)
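
For reference, the AsParallel approach mentioned above could look roughly like this. It is a sketch under assumptions: PlinqFixedDop, BlockingIo and ProcessAll are hypothetical names, and BlockingIo stands in for a synchronous I/O call. Note that PLINQ blocks one thread per in-flight item, unlike the async ForEachAsync from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class PlinqFixedDop
{
    // Hypothetical blocking I/O call (assumption: ~10 ms latency, returns a derived value).
    public static int BlockingIo(int item)
    {
        Thread.Sleep(10);
        return item * 2;
    }

    // Processes all items with PLINQ, requesting the given degree of parallelism.
    // Result order is not guaranteed because AsOrdered() is not used.
    public static List<int> ProcessAll(IEnumerable<int> items, int dop)
    {
        return items
            .AsParallel()
            .WithDegreeOfParallelism(dop)
            .Select(item => BlockingIo(item))
            .ToList();
    }
}
```

For example, `PlinqFixedDop.ProcessAll(Enumerable.Range(0, 100), 8)` runs the 100 calls with a requested DOP of 8.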

because I want to queue as many IO based calls to the database as possible

Why is that? Not a good plan.

Btw, the method that you posted here is good and I use this as well. I wish it was in the framework. This exact method is the answer to about 10 SO questions per week ("How can I asynchronously process 100000 items in parallel?").
