1

Using C# and Linq to SQL, I found that my query with multiple where is orders of magnitude slower than with a single where / and.

Here is the query

using (TeradiodeDataContext dc = new TeradiodeDataContext())
{
    var filterPartNumberID = 71;
    var diodeIDsInBlades = (from bd in dc.BladeDiodes
                            select bd.DiodeID.Value).Distinct();
    var diodesWithTestData = (from t in dc.Tests
                              join tt in dc.TestTypes on t.TestTypeID equals tt.ID
                              where tt.DevicePartNumberID == filterPartNumberID
                              select t.DeviceID.Value).Distinct();
    var result = (from d in dc.Diodes
                  where d.DevicePartNumberID == filterPartNumberID
                  where diodesWithTestData.Contains(d.ID)
                  where !diodeIDsInBlades.Contains(d.ID)
                  orderby d.Name
                  select d);
    var list = result.ToList();
    // ~15 seconds
}

However, when the condition in the final query is this

where d.DevicePartNumberID == filterPartNumberID
& diodesWithTestData.Contains(d.ID)
& !diodeIDsInBlades.Contains(d.ID)
// milliseconds

it is very fast.

Comparing the SQL in result before calling ToList(), here are the queries (value 71 manually added in place of @params)

-- MULTIPLE WHERE
SELECT [t0].[ID], [t0].[Name], [t0].[M2MID], [t0].[DevicePartNumberID], [t0].[Comments], [t0].[Hold]
FROM [dbo].[Diode] AS [t0]
WHERE (NOT (EXISTS(
    SELECT NULL AS [EMPTY]
    FROM (
        SELECT DISTINCT [t2].[value]
        FROM (
            SELECT [t1].[DiodeID] AS [value]
            FROM [dbo].[BladeDiode] AS [t1]
            ) AS [t2]
        ) AS [t3]
    WHERE [t3].[value] = [t0].[ID]
    ))) AND (EXISTS(
    SELECT NULL AS [EMPTY]
    FROM (
        SELECT DISTINCT [t6].[value]
        FROM (
            SELECT [t4].[DeviceID] AS [value], [t5].[DevicePartNumberID]
            FROM [dbo].[Test] AS [t4]
            INNER JOIN [dbo].[TestType] AS [t5] ON [t4].[TestTypeID] = ([t5].[ID])
            ) AS [t6]
        WHERE [t6].[DevicePartNumberID] = (71)
        ) AS [t7]
    WHERE [t7].[value] = [t0].[ID]
    )) AND ([t0].[DevicePartNumberID] = 71)
ORDER BY [t0].[Name]

and

-- SINGLE WHERE
SELECT [t0].[ID], [t0].[Name], [t0].[M2MID], [t0].[DevicePartNumberID], [t0].[Comments], [t0].[Hold]
FROM [dbo].[Diode] AS [t0]
WHERE ([t0].[DevicePartNumberID] = 71) AND (EXISTS(
    SELECT NULL AS [EMPTY]
    FROM (
        SELECT DISTINCT [t3].[value]
        FROM (
            SELECT [t1].[DeviceID] AS [value], [t2].[DevicePartNumberID]
            FROM [dbo].[Test] AS [t1]
            INNER JOIN [dbo].[TestType] AS [t2] ON [t1].[TestTypeID] = ([t2].[ID])
            ) AS [t3]
        WHERE [t3].[DevicePartNumberID] = (71)
        ) AS [t4]
    WHERE [t4].[value] = [t0].[ID]
    )) AND (NOT (EXISTS(
    SELECT NULL AS [EMPTY]
    FROM (
        SELECT DISTINCT [t6].[value]
        FROM (
            SELECT [t5].[DiodeID] AS [value]
            FROM [dbo].[BladeDiode] AS [t5]
            ) AS [t6]
        ) AS [t7]
    WHERE [t7].[value] = [t0].[ID]
    )))
ORDER BY [t0].[Name]

The two SQL queries execute in < 1 second in SSMS and produce the same results.

So I'm wondering why the first is slower on the LINQ side. It's worrying to me because I know I've used multiple where elsewhere, without being aware of a such a severe performance impact.

This question even has answered with both multiple & and where. And this answer even suggests using multiple where clauses.

Can anyone explain why this happens in my case?

ndrwnaguib
  • 5,623
  • 3
  • 28
  • 51
djv
  • 15,168
  • 7
  • 48
  • 72
  • You need to look at the execution plan. However, looking at the 2 queries, the 2nd query (single where) first searches for `[t0].[DevicePartNumberID] = 71` and then performs other conditions. The first query does all the other queries (corelated queries) and then looks for `[t0].[DevicePartNumberID] = 71`. That is just what I see from looking at the code. It is possible that I am wrong and the plan generated is totally different than what I have said. – CodingYoshi Dec 21 '17 at 22:32
  • You wouldn't write this by hand so don't ask the ORM to do it for you. Try to avoid the nested selects using join (avoid the .contains). This will make the generated sql simpler and faster – George Vovos Dec 21 '17 at 22:36
  • Did you mean to put `&` in the second condition and not `&&` ? – NetMage Dec 21 '17 at 22:38
  • @GeorgeVovos please explain your reasoning for *avoid the .contains*. The idea with ORMs are that one should not care what code is produced by the ORM: If things are slow, then you can look into it to see what has been generated and see if it can be improved. Otherwise, the code written on the app side gets read by devs so that code should be written in a manner wherein it is easy to read (regardless of what the ORM spits out). – CodingYoshi Dec 21 '17 at 22:40
  • The SQL queries are the same except they have the three closes in the WHERE in different orders, so that is why they run in similar time. Something else must be happening to make the result take differing amounts of time. Can you get a copy of LINQPad and run your LINQ queries in that, and look at the execution time there? – NetMage Dec 21 '17 at 22:46
  • 2
    @CodingYoshi *the idea with ORMs are that one should not care what code is produced by the ORM*: That is completely wrong (for ORMs and any other tool). I'm not talking about contains is general,I'm talking about the specific example.If the OP removes the diodeIDsInBlades and diodesWithTestData variable an use joins in the main query EF will probably produce more efficient sql – George Vovos Dec 21 '17 at 22:49
  • @GeorgeVovos I do not want to hijack this thread but when you say that *the idea with ORMs are that one should not care what code is produced by the ORM is completely wrong*, is not correct. If you code with that in mind, you will end up writing Linq which is harder to read. Anyhow, like I said, I don't want to hijack this thread. – CodingYoshi Dec 21 '17 at 22:56
  • @NetMage both `&` and `&&` will be treated as sql `AND` so there is no difference on the db side. – CodingYoshi Dec 21 '17 at 22:57
  • @CodingYoshi You could make this a question.You are very right in that the query must be readable,you should also avoid premature optimizations but there a long way between premature optimization and *not care what code is produced by the ORM * .With ORMs the first thing you should worry about is the generated query. Efficient doesn't mean unreadable.You mainly want to avoid select * and many nested select statements – George Vovos Dec 21 '17 at 23:01
  • there is a difference between writing a bad query and writing code that is difficult to read. ORMs (like EF) making doing both easy. – Matthew Whited Dec 21 '17 at 23:24
  • The generated sql in each case takes less than one second to execute in ssms. What is baffling to me is how the first one takes so long when coming from linq, but is instantaneous in ssms. I understand the query doesn't look great but I had originally intended to perform outer joins and then I had a tradeoff between code readability and performance (which ends up being negligible with the single where clause) – djv Dec 22 '17 at 00:47
  • If you would like help with writing an efficient query, post your entity classes and your desired result. My guess is you can get the proper joins and avoid all those .Contains(). BTW: your issue is: multiple From statements not "multiple where" – JohnWrensby Dec 22 '17 at 02:31
  • If your generated SQL runs fast and you clear your SQL query cache and it still runs fast. That is because the time is your query compilation at runtime, nothing to do with SQL. I know that doesn't solve your issue, I was just answering your "What is baffling..." – JohnWrensby Dec 22 '17 at 02:35

1 Answers1

0

Because writing like this

if (someParam1 != 0)
{
    myQuery = myQuery.Where(q => q.SomeField1 == someParam1)
}
if (someParam2 != 0)
{
    myQuery = myQuery.Where(q => q.SomeField2 == someParam2)
}

is NOT(upd) the same as (in case when someParam1 and someParam2 != 0)

myQuery = from t in Table
          where t.SomeField1 == someParam1
             && t.SomeField2 == someParam2
          select t;

is (NOT deleted) the same as

myQuery = from t in Table
          where t.SomeField1 == someParam1
          where t.SomeField2 == someParam2
          select t;

UPD

Yes, I do mistake. Second query is same, first is not same.

First and Second queries not EXACTLY the same. Let me show you what I mean.

1st query with lamda-expression writen as

t.Where(r => t.SomeField1 == someParam1 && t.SomeField2 == someParam2)

2nd query as

t.Where(r => r.SomeField1 == someParam1).Where(r => r.SomeField2 == someParam2)

In this case in generated SQL Predicate with SomeField2 goes first (it is important, see below)

In 1st case we getting this SQL:

SELECT <all field from Table>
  FROM table t
 WHERE t.SomeField1 = :someParam1
   AND t.SomeField2 = :someParam2

In 2 case the SQL is:

SELECT <all field from Table>
  FROM table t
 WHERE t.SomeField2 = :someParam2
   AND t.SomeField1 = :someParam1

As we see there are 2 'same' SQLs. As we see, the OP's SQLs are also 'same', they are different in order of predicates in WHERE clause (as in my example). And I guess that SQL optimizer generate 2 different execution plans and may be(!!!) doing NOT EXISTS, then EXISTS and then filtering take more time than do first filtering and after that do EXISTS and NOT EXISTS

UPD2

It is a 'problem' of Linq Provider (ORM). I'm using another ORM (linq2db), and it generates for me EXACTLY the same SQLs in both cases.

Hemid Abbasov
  • 175
  • 2
  • 5
  • Have you read the question carefully? The OP says both queries *produce the same results.*. So please explain why they are producing the same result but one takes longer than the other. – CodingYoshi Dec 21 '17 at 22:35
  • Actually that is not true. The third `myQuery` is converted by the compiler into exactly `Table.Where(t => t.SomeField1 == someParam1).Where(t => t.SomeField2 == someParam2)` and that is converted by the LINQ to SQL provider into `WHERE ... AND ...` exactly like the second query. – NetMage Dec 21 '17 at 22:35
  • Please, look at my UPD in answer – Hemid Abbasov Dec 21 '17 at 23:21