1

UPDATE: There is a request about this on the EF Team's UserVoice site. Vote it up

This thread about the Include statement is also related.

Why it's important

Querying with EF6 and Linq rocks! But if it involves a couple of Join Tables, and a nullable Foreign Key, it bogs down with 1000 lines of T-Sql.

If Inner Joins could be forced, it would perform with only 10 lines

For example, an EF6 project references a SQL database. There's a Student table and a Tutor table. Not every Student has a Tutor, so Student.TutorId can be null.

All Student-Tutor info is easily found with T-SQL:

SELECT s.Name, t.Name FROM Student s JOIN Tutor t ON s.TutorId = t.Id

The Linq is like so:

var result = context.Students
.Where(s => s.TutorId != null)
.Select(s => new { StudentName = s.Name, TutorName = s.Tutor.Name })
.ToList();

But EF6 generates this SQL:

SELECT [Extent1].[Name], 
    CASE WHEN ([Extent2].[FirstName] IS NULL)
       THEN N'' 
    ELSE
       [Extent2].[Name] 
    END AS [C1]
FROM  [dbo].[Student] AS [Extent1]
LEFT OUTER JOIN [dbo].[Tutor] AS [Extent2] ON [Extent1].[TutorId] = [Extent2].[Id]
WHERE [Extent1].[TutorId] IS NOT NULL

Thanks to Peter for asking about this many years ago. Hopefully there is a better answer now than to abandon Linq.

This GitHub Repository has source code for experimentation.

  • Could you explain again why is this important? Join tables usually does not have nullable columns, and your example is simple one-to-many optional relationship, so with `INNER` of `LEFT` join, the result should be one and the same. – Ivan Stoev Jun 26 '17 at 10:06
  • Because a Join Table may or may not have a match for a given row, EF generates Outer Joins just like it does for the nullable FK. – Christopher J. Grace Jun 26 '17 at 10:11
  • I suspect Outer Joins are slower than Inner Joins, but that is not the really expensive issue. If there are several Outer Joins in a query, then there is a compounding effect, as EF generates more and more code to deal with each possible combination of null and non null values. That's how the generated T-SQL explodes into 1000 lines. – Christopher J. Grace Jun 26 '17 at 10:13
  • Any more realistic example? It happens only when you "navigate" inside the query from optional part. In the above sample you could easily get `inner join` by starting from the *required end* (`Tutor`), but I still don't see sample usage with link (join, junction) table (which usually you don't have direct access in EF). – Ivan Stoev Jun 26 '17 at 10:17
  • There is an example of that in the GitHub Repo. The manual join approach is generating Inner Joins, so that's fine. I thought it might require a null check, but now it seems not. – Christopher J. Grace Jun 26 '17 at 19:52
  • Has anyone else seen the compounding effect, where all of a sudden, what should be a few simple joins in SQL turns into all these nested sub query "Projects" and "Limits" with Outer Joins and Cross Joins. Ay-ay-ay. I have tried to reproduce that in the GitHub Repo, but already burned several hours building out the test data as it is. This is the biggest barrier to using Linq and EF for heavy data retrieval IMO. Seems like most developers just give up and revert to TSQL. But Linq has so many benefits. – Christopher J. Grace Jun 26 '17 at 20:10

3 Answers3

3

Entity Framework will generate an inner join if you add a not-null condition after the projection:

var result = context.Students
.Select(s => new { StudentName = s.Name, TutorName = s.Tutor.Name })
.Where(x => x.TutorName != null)
.ToList();

I don't know why it works this way. If EF is smart enough to infer that x.TutorName != null amounts to an inner join, I'd think it should be able to the same with s.TutorId != null.

Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
  • This does work, and is more elegant IMO than ditching the navigation property. I wonder if using s.TutorId != null would have worked if the Where clause was after the projection. – Christopher J. Grace Jun 26 '17 at 10:01
  • Between the two of these answers, I am able to obtain T-Sql from EF that is about what I'd write myself. Adding a GroupBy totally destroys it though, particularly when returning individual values from the group and not just aggregates. This seems like closely related expression tree logic, but maybe I should make it a new question. – Christopher J. Grace Jun 26 '17 at 10:08
  • 1
    I can imagine that in that case something along the lines of Ivan's answer will be more reliable. As long as it's not 100% clear when (and why) EF sometimes resolves to an inner join when using navigation properties it'll always a gamble. – Gert Arnold Jun 26 '17 at 12:46
2

The only reliable way is if you can construct the LINQ query in such a way that the relationships are "navigated" from the required end to optional end through SelectMany, which I guess makes it not generally applicable.

For demonstration purposes, if you write the sample query like this

var result = db.Tutors
    .SelectMany(t => t.Students, (t, s) => new { StudentName = s.Name, TutorName = t.Name })
    .ToList();

the generated SQL will be something like this

SELECT
    [Extent1].[Id] AS [Id],
    [Extent2].[Name] AS [Name],
    [Extent1].[Name] AS [Name1]
    FROM  [dbo].[Tutors] AS [Extent1]
    INNER JOIN [dbo].[Students] AS [Extent2] ON [Extent1].[Id] = [Extent2].[TutorId]
Ivan Stoev
  • 195,425
  • 15
  • 312
  • 343
  • 1
    Good point. Querying "from the required end to optional end" sounds like a more solid rule than "filter after the projection". – Gert Arnold Jun 26 '17 at 12:48
  • I experimented with using navigation properties from the required end, but it still resulted in Outer Joins. That was a Select though, not a SelectMany. I wonder why that would make a difference. Very tricky. Some say to just stick with TSQL if there is any complexity. Linq has so many benefits though. – Christopher J. Grace Jun 26 '17 at 19:54
  • Thanks for your thoughtful answer @Ivan. I am using SelectMany in the actual project, and will post how it turns out. – Christopher J. Grace Jun 30 '17 at 20:39
  • Navigating from the required "required end to the optional end" has been working, and has been a helpful technique. Using Select vs. SelectMany does not seem to make much difference. But I will keep playing with it. – Christopher J. Grace Aug 03 '17 at 04:40
  • Join/Bridge tables can also result in unwanted LEFT OUTER JOINS. So can GroupBy statements. The query was taking so much effort to make performant in LINQ, I had little choice but to make it a proc. Even if I could find just the right order, it would be so fragile, future developers would have a difficult time modifying it. Such a shame. – Christopher J. Grace Aug 03 '17 at 05:24
1

At this case you should join tables explicitly, instead of calling tutor as student's property:

var result = (from s in context.Students
             join t in context.Tutors
             on s.TutorId equals t.Id
             select new 
             {
                 StudentName = s.Name, 
                 TutorName = t.Name
             }).ToLost();
Slava Utesinov
  • 13,410
  • 2
  • 19
  • 26