136

I'm working on some some Web API stuff using Entity Framework 6 and one of my controller methods is a "Get All" that expects to receive the contents of a table from my database as IQueryable<Entity>. In my repository I'm wondering if there is any advantageous reason to do this asynchronously as I'm new to using EF with async.

Basically it boils down to

 public async Task<IQueryable<URL>> GetAllUrlsAsync()
 {
    var urls = await context.Urls.ToListAsync();
    return urls.AsQueryable();
 }

vs

 public IQueryable<URL> GetAllUrls()
 {
    return context.Urls.AsQueryable();
 }

Will the async version actually yield performance benefits here or am I incurring unnecessary overhead by projecting to a List first (using async mind you) and THEN going to IQueryable?

Jesse Carter
  • 20,062
  • 7
  • 64
  • 101
  • 2
    context.Urls is of type DbSet which implements IQueryable so the .AsQueryable() is redundant. https://msdn.microsoft.com/en-us/library/gg696460(v=vs.113).aspx Assuming you've followed the patterns that EF provides, or used the tooling that creates the context for you. – Sean B Nov 14 '16 at 16:53

3 Answers3

290

The problem seems to be that you have misunderstood how async/await work with Entity Framework.

About Entity Framework

So, let's look at this code:

public IQueryable<URL> GetAllUrls()
{
    return context.Urls.AsQueryable();
}

and example of it usage:

repo.GetAllUrls().Where(u => <condition>).Take(10).ToList()

What happens there?

  1. We are getting IQueryable object (not accessing database yet) using repo.GetAllUrls()
  2. We create a new IQueryable object with specified condition using .Where(u => <condition>
  3. We create a new IQueryable object with specified paging limit using .Take(10)
  4. We retrieve results from database using .ToList(). Our IQueryable object is compiled to sql (like select top 10 * from Urls where <condition>). And database can use indexes, sql server send you only 10 objects from your database (not all billion urls stored in database)

Okay, let's look at first code:

public async Task<IQueryable<URL>> GetAllUrlsAsync()
{
    var urls = await context.Urls.ToListAsync();
    return urls.AsQueryable();
}

With the same example of usage we got:

  1. We are loading in memory all billion urls stored in your database using await context.Urls.ToListAsync();.
  2. We got memory overflow. Right way to kill your server

About async/await

Why async/await is preferred to use? Let's look at this code:

var stuff1 = repo.GetStuff1ForUser(userId);
var stuff2 = repo.GetStuff2ForUser(userId);
return View(new Model(stuff1, stuff2));

What happens here?

  1. Starting on line 1 var stuff1 = ...
  2. We send request to sql server that we want to get some stuff1 for userId
  3. We wait (current thread is blocked)
  4. We wait (current thread is blocked)
  5. .....
  6. Sql server send to us response
  7. We move to line 2 var stuff2 = ...
  8. We send request to sql server that we want to get some stuff2 for userId
  9. We wait (current thread is blocked)
  10. And again
  11. .....
  12. Sql server send to us response
  13. We render view

So let's look to an async version of it:

var stuff1Task = repo.GetStuff1ForUserAsync(userId);
var stuff2Task = repo.GetStuff2ForUserAsync(userId);
await Task.WhenAll(stuff1Task, stuff2Task);
return View(new Model(stuff1Task.Result, stuff2Task.Result));

What happens here?

  1. We send request to sql server to get stuff1 (line 1)
  2. We send request to sql server to get stuff2 (line 2)
  3. We wait for responses from sql server, but current thread isn't blocked, he can handle queries from another users
  4. We render view

Right way to do it

So good code here:

using System.Data.Entity;

public IQueryable<URL> GetAllUrls()
{
   return context.Urls.AsQueryable();
}

public async Task<List<URL>> GetAllUrlsByUser(int userId) {
   return await GetAllUrls().Where(u => u.User.Id == userId).ToListAsync();
}

Note, than you must add using System.Data.Entity in order to use method ToListAsync() for IQueryable.

Note, that if you don't need filtering and paging and stuff, you don't need to work with IQueryable. You can just use await context.Urls.ToListAsync() and work with materialized List<Url>.

Corical
  • 15
  • 5
Viktor Lova
  • 4,776
  • 2
  • 19
  • 25
  • `await Task.WhenAll(stuff1Task, stuff2Task);` would require the `async` keyword in the Action method signature. Does that in itself have any impact on how IIS handles the request? – Korijn Oct 31 '14 at 14:38
  • @Korjin I don't know how is implemented IIS, but to work efficiently with async/await methods IIS must work in asynchronous way too. I can't say you more. – Viktor Lova Oct 31 '14 at 14:51
  • 3
    @Korijn looking at picture http://i2.iis.net/media/7188126/introduction-to-iis-architecture-101-HTTPRequestWorkerProc.png?cdn_id=2012-08-15-001 from [Introduction to IIS architecture](http://www.iis.net/learn/get-started/introduction-to-iis/introduction-to-iis-architecture) I can say that all requests in IIS are processed in asynchronous way – Viktor Lova Oct 31 '14 at 14:53
  • anyway, IIS doesn't knows about async/await in code of hosted application – Viktor Lova Oct 31 '14 at 15:01
  • 8
    Since you are not acting on the result set in the `GetAllUrlsByUser` method, you don't need to make it async. Just return the Task and save your self an unnecessary state machine from being generated by the compiler. – Johnathon Sullinger Feb 13 '16 at 06:38
  • 1
    @JohnathonSullinger Although that would work in happy flow, doesn't that have the side-effect that any exception will not surface here and propagate to the first place that has an await? (Not that thats neccesarily bad, but it is a change in behavior?) – Henry Been May 22 '16 at 10:42
  • Yeah, if you want to handle exceptions in this spot, and not further up the stack, then you would want to await here. It depends on your exception handling. If you're just going to write a log entry, the let it happen further up if you are doing recovery, then await and recover etc – Johnathon Sullinger May 22 '16 at 15:31
  • 13
    Interesting no one notice that the 2nd code example in "About async/await" is total non-sense, cause it would throw an exception since neither EF nor EF Core are thread safe, so trying to run in in parallel will just throw an exception – Tseng Aug 01 '17 at 16:09
  • How would you await, make it async a JOIN LINQ code? ex: queryable1.Join(queryable2, x => x.RefId, y => y.ID, (x, y) => new { x, y }).ToList(); – Duduman Bogdan Vlad Jan 19 '18 at 05:34
  • 1
    Although this answer is correct, I'd recommend to avoid using `async` and `await` if you are NOT doing anything with the list. Let the caller to `await` it. When you await the call at this stage `return await GetAllUrls().Where(u => u.User.Id == userId).ToListAsync();` you're creating an extra async wrapper when you decompile the assembly and look at the IL. – Ali Khakpouri Aug 21 '19 at 21:00
16

There is a massive difference in the example you have posted, the first version:

var urls = await context.Urls.ToListAsync();

This is bad, it basically does select * from table, returns all results into memory and then applies the where against that in memory collection rather than doing select * from table where... against the database.

The second method will not actually hit the database until a query is applied to the IQueryable (probably via a linq .Where().Select() style operation which will only return the db values which match the query.

If your examples were comparable, the async version will usually be slightly slower per request as there is more overhead in the state machine which the compiler generates to allow the async functionality.

However the major difference (and benefit) is that the async version allows more concurrent requests as it doesn't block the processing thread whilst it is waiting for IO to complete (db query, file access, web request etc).

Trevor Pilley
  • 16,156
  • 5
  • 44
  • 60
  • 7
    until a query is applied to IQueryable.... neither IQueryable.Where and IQueryable.Select force the query to execute. The prior applies a predicate and the latter applies a projection. It's not executed until a materializing operator is used, like ToList, ToArray, Single or First. – JJS Nov 22 '16 at 13:30
3

Long story short,
IQueryable is designed to postpone RUN process and firstly build the expression in conjunction with other IQueryable expressions, and then interprets and runs the expression as a whole.
But ToList() method (or a few sort of methods like that), are ment to run the expression instantly "as is".
Your first method (GetAllUrlsAsync), will run imediately, because it is IQueryable followed by ToListAsync() method. hence it runs instantly (asynchronous), and returns a bunch of IEnumerables.
Meanwhile your second method (GetAllUrls), won't get run. Instead, it returns an expression and CALLER of this method is responsible to run the expression.

Rzassar
  • 2,117
  • 1
  • 33
  • 55