2

I have a query:

var contactInfos = from person in persons

                    join tempDesiredCar in desiredCars on person.contact_id equals tempDesiredCar.groupEntity_id
                                   into tempDesiredCars
                    from desiredCar in tempDesiredCars.DefaultIfEmpty()
                    select new {name = person.name, car = desiredCar.name};

This code translates to SQL:

SELECT [t1].[name], [t19].[name] AS [car]
FROM [dbo].[Person] AS [t1]
CROSS APPLY ((
        SELECT NULL AS [EMPTY]
        ) AS [t6]
    OUTER APPLY (
        SELECT [t18].[name]
        FROM (
            SELECT [t17].[contact_id], [t17].[name]
            FROM (
                SELECT [t7].[contact_id], [t11].[name]
                FROM [dbo].[DesiredCar] AS [t7]
                INNER JOIN (
                    SELECT MAX([t9].[value]) AS [value]
                    FROM (
                        SELECT [t8].[id] AS [value], [t8].[contact_id]
                        FROM [dbo].[DesiredCar] AS [t8]
                        ) AS [t9]
                    GROUP BY [t9].[contact_id]
                    ) AS [t10] ON ([t7].[id]) = [t10].[value]
                LEFT OUTER JOIN [dbo].[FamilyModel] AS [t11] ON [t11].[id] = [t7].[model]
                WHERE [t7].[model] IS NOT NULL
                UNION
                SELECT [t12].[contact_id], [t16].[name]
                FROM [dbo].[DesiredCar] AS [t12]
                INNER JOIN (
                    SELECT MAX([t14].[value]) AS [value]
                    FROM (
                        SELECT [t13].[id] AS [value], [t13].[contact_id]
                        FROM [dbo].[DesiredCar] AS [t13]
                        ) AS [t14]
                    GROUP BY [t14].[contact_id]
                    ) AS [t15] ON ([t12].[id]) = [t15].[value]
                LEFT OUTER JOIN [dbo].[CarBrand] AS [t16] ON [t16].[id] = [t12].[carBrand_id]
                WHERE [t12].[carBrand_id] IS NOT NULL
                ) AS [t17]
            ) AS [t18]
        where [t1].[contact_id] = [t18].[contact_id]
        ) AS [t19])

That code uses Apply. If replace Apply by Left Join code runs faster in a few times. How to force Linq to generate code using the Left Join?

Coder
  • 41
  • 4
  • 1
    LINQ to SQL isn't a "better" way to write SQL nor does it replace SQL. It's easier and faster to write such a simple statement by hand and have L2S create the new objects from the results – Panagiotis Kanavos Dec 28 '12 at 08:27
  • 1
    This solution makes the automatic refactoring not available. – Coder Dec 28 '12 at 08:36
  • 1
    Please show us the full LINQ query... – usr Dec 28 '12 at 12:05
  • Also please post the graphical execution plans. I cannot see a reason why APPLY should result in a different query plan here. Are you sure the transformation to `LEFT JOIN` is semantically identical? – usr Dec 28 '12 at 12:06
  • Plan with Apply http://yadi.sk/d/w9SAcsiu1VDKU – Coder Dec 28 '12 at 12:51
  • Plan with Join http://yadi.sk/d/SbIPPJjf1VDK8. – Coder Dec 28 '12 at 12:51
  • [t18] has many clauses that are not coming from the posted linq statement. Can you post the code that generates those? – Amy B Dec 30 '12 at 04:09

3 Answers3

2

I’ve found the reason why LINQ is using an apply operator. Operands are checking, whether is there dependence from foreign key in that join or there is not. If it's just a table – linq uses "Left Join". LINQ uses "Outer apply" if the subquery contains Group By, Order By with column used in “equals”. Use simple subqueries in LINQ JOIN.

Bogdan Verbenets
  • 25,686
  • 13
  • 66
  • 119
Coder
  • 41
  • 4
  • 3
    I've been reading your answer for 10 minutes and I still don't understand it. I don't understand your sentences and I don't understand what do you mean by "simple subqueries". I'm trying to avoid an "APPLY" operator myself, for different reasons. – migle Sep 29 '14 at 09:55
0

I cannot find a reason why a) join and apply produce different plans and b) LINQ is using an apply at all. This is a strange case.

I tried coming up with workarounds, hope they work:

//project only the needed columns
var contactInfos = from person in persons
                    join tempDesiredCar in
                         desiredCars.Select(x => new { x.groupEntity_id, x.name })
                         on person.contact_id equals tempDesiredCar.groupEntity_id
                         into tempDesiredCars
                    from desiredCar in tempDesiredCars.DefaultIfEmpty()
                    select new { name = person.name, car = desiredCar.name };

//change join to from
var contactInfos = from person in persons
                   from desiredCar in
                         desiredCars.Select(x => new { x.groupEntity_id, x.name })
                                    .Where(x => person.contact_id = x.groupEntity_id)
                                    .DefaultIfEmpty()
                    select new { name = person.name, car = desiredCar.name };

These are just speculative rewritings. I hope they work by coincidence... I know this is not a good way to solve a problem but in this strange case it's the best I can offer.

usr
  • 168,620
  • 35
  • 240
  • 369
0

I inspected the query plans.

The apply plan uses Nested Loops to join the tables together. For each record in the Person table, it reads every record in the DesiredCars table. Do you really need to find the desired car of every person in the database? If not, consider filtering the Person table down a bit first.

The join plan uses Hashmatching to join the tables together. It reads both tables once, producing a map of one table, and using that map on the second table. The optimizer has pointed out a couple of index recommendations. These recommendations may help both plans.

        <MissingIndexes>
          <MissingIndexGroup Impact="12.6549">
            <MissingIndex Database="[AudiCRM]" Schema="[dbo]" Table="[DesiredCar]">
              <ColumnGroup Usage="INEQUALITY">
                <Column Name="[carBrand_id]" ColumnId="13" />
              </ColumnGroup>
              <ColumnGroup Usage="INCLUDE">
                <Column Name="[id]" ColumnId="1" />
                <Column Name="[contact_id]" ColumnId="2" />
              </ColumnGroup>
            </MissingIndex>
          </MissingIndexGroup>
          <MissingIndexGroup Impact="12.6549">
            <MissingIndex Database="[AudiCRM]" Schema="[dbo]" Table="[DesiredCar]">
              <ColumnGroup Usage="INEQUALITY">
                <Column Name="[model]" ColumnId="3" />
              </ColumnGroup>
              <ColumnGroup Usage="INCLUDE">
                <Column Name="[id]" ColumnId="1" />
                <Column Name="[contact_id]" ColumnId="2" />
              </ColumnGroup>
            </MissingIndex>
          </MissingIndexGroup>
        </MissingIndexes>
Amy B
  • 108,202
  • 21
  • 135
  • 185