
I have an async method that should look up database entries. It filters by name and is thus a candidate for parallel execution.

However, I cannot find a simple way to support both parallel execution and asynchronous tasks.

Here's what I have:

private async Task<List<Item>> GetMatchingItems(string name) {
    using (var entities = new Entities()) {
        var items =  from item in entities.Item.AsParallel()
               where item.Name.Contains(name)
               select item;
        return await items.ToListAsync(); //complains "ParallelQuery<Item> does not contain a definition for ToListAsync..."
    }
}

When I remove AsParallel(), it compiles. Am I not supposed to use both features at the same time? Or am I misunderstanding something?

IMHO, both make sense:

  • AsParallel() would indicate that the database query may get split up into several sub-queries running at the same time, because the individual matching of any item is not dependent on any other item. UPDATE: Bad idea in this example, see comments and answer!
  • ToListAsync() would support the asynchronous execution of this method, to allow other match methods (on other data) to start executing immediately.

How can I use both parallel execution (with LINQ) and asynchronous tasks at the same time?

Marcel
  • Why do you want to use `.AsParallel()` on a database query? Do you think it is faster? It will crash faster if that's what you want... Please go read Stephen Cleary's blog... – Aron May 15 '15 at 14:38
  • I guess you mean this one: http://blog.stephencleary.com/2014/04/a-tour-of-task-part-0-overview.html – Marcel May 18 '15 at 06:40
  • The issue is that you are trying to get EF to optimise your query with parallelism. That is the job of the SQL Server query plan generator. Splitting the query up actually has a negative impact in this case. – Aron May 18 '15 at 10:55
  • @Aron I totally agree. The chosen example with the Entities-To-SQL is really bad. I'll leave the question as is, however, for future reference. – Marcel May 18 '15 at 11:05
  • Do the right thing the right way: if you need to access multiple DbContext instances at the same time, parallelize over the list of DbContext instances, not over a single DbContext. – Mr. Squirrel.Downy Nov 04 '21 at 08:47

2 Answers


You can't, and shouldn't. PLINQ isn't for database access. The database knows best how to parallelize the query and does that just fine on its own using normal LINQ. PLINQ is for querying in-memory objects where you do computationally expensive calculations within the LINQ query, so that the work can be parallelized on the client side (as opposed to on the database server side).

A better answer might be: PLINQ is for distributing a compute-intensive query across multiple threads. Async is for returning the thread so that others can use it while you wait on an external resource (database, disk I/O, network I/O, etc.).

Async PLINQ doesn't have a very strong use case: you would want to return the thread while you wait AND have a lot of calculations to do. If you are busy calculating, you NEED the thread (or multiple threads). The two sit at almost completely opposite ends of optimization. If you need something like this, there are much better mechanisms, such as Tasks.
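To make the distinction concrete, here is a minimal sketch that is not taken from the question. The class and method names (PlinqVersusAsync, CountPrimesBelow, ComputeInParallel, FetchAsync) are made up for illustration, the prime counting stands in for any compute-heavy work, and Task.Delay stands in for a real database or network call.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PlinqVersusAsync
{
    // Hypothetical CPU-bound work: counts the primes below n by trial division.
    public static int CountPrimesBelow(int n)
    {
        int count = 0;
        for (int candidate = 2; candidate < n; candidate++)
        {
            bool isPrime = true;
            for (int d = 2; (long)d * d <= candidate; d++)
            {
                if (candidate % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }

    // PLINQ: the computation is spread across several threads, and every
    // one of them is busy for the duration. AsOrdered keeps the input order.
    public static List<int> ComputeInParallel(IEnumerable<int> inputs)
    {
        return inputs.AsParallel().AsOrdered().Select(CountPrimesBelow).ToList();
    }

    // Async: Task.Delay is an assumed placeholder for an I/O wait.
    // No thread is blocked while the delay is pending.
    public static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(100);
        return "result for " + name;
    }
}
```

Calling ComputeInParallel(new[] { 10, 100, 1000 }) returns the counts 4, 25 and 168, computed on multiple cores. Awaiting Task.WhenAll(FetchAsync("a"), FetchAsync("b")) overlaps the two waits without tying up a thread for either of them.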

Robert McKee
  • TPL Dataflow is a good library if someone really does need parallel *and* asynchronous capabilities. In this case, PLINQ is misused (misunderstood, I think) by the OP. – Stephen Cleary May 15 '15 at 20:33

Well, you can't. These are two different options that don't go together.

You can use Task.Run with async-await to parallelize the synchronous part of the asynchronous operation (i.e. what comes before the first await) on multiple ThreadPool threads, but without all the partition logic built into PLINQ, for example:

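// Start one Task.Run per item so the synchronous parts run on ThreadPool threads.
var tasks = enumerable.Select(item => Task.Run(async () =>
{
    LongSynchronousPart(item);          // CPU-bound work before the first await
    await AsynchronousPartAsync(item);  // asynchronous work; frees the thread while waiting
}));
await Task.WhenAll(tasks);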

In your case, however (assuming it's Entity Framework), there's no value in using PLINQ, as there's no actual work to parallelize. The query itself is executed in the DB.
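For anyone who wants to try the Task.Run pattern from the snippet above, here is a self-contained sketch. The two helper methods are hypothetical stand-ins: Thread.SpinWait simulates the CPU-bound part and Task.Delay simulates the asynchronous part. The class name TaskRunFanOut and the method ProcessAllAsync are also made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TaskRunFanOut
{
    // Hypothetical CPU-bound step: burns some cycles, then squares the item.
    private static int LongSynchronousPart(int item)
    {
        Thread.SpinWait(100000);
        return item * item;
    }

    // Hypothetical I/O-bound step: Task.Delay stands in for a database or network call.
    private static async Task<int> AsynchronousPartAsync(int value)
    {
        await Task.Delay(50);
        return value + 1;
    }

    public static async Task<int[]> ProcessAllAsync(IEnumerable<int> items)
    {
        // One Task.Run per item: there is no PLINQ-style partitioning,
        // each item is simply queued to the thread pool.
        IEnumerable<Task<int>> tasks = items.Select(item => Task.Run(async () =>
        {
            int squared = LongSynchronousPart(item);
            return await AsynchronousPartAsync(squared);
        }));

        // WhenAll enumerates the sequence (starting the tasks) and
        // returns the results in the order of the input.
        return await Task.WhenAll(tasks);
    }
}
```

Awaiting ProcessAllAsync(new[] { 1, 2, 3 }) yields 2, 5 and 10. Because every item gets its own task, this approach fits a moderate number of items; for large inputs some form of throttling would be needed, which this sketch leaves out.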

i3arnon