41

Which clause performs first in a SELECT statement?

I have a question about how a SELECT query is processed.

Consider the example below:

SELECT * 
FROM #temp A 
INNER JOIN #temp B ON A.id = B.id 
INNER JOIN #temp C ON B.id = C.id 
WHERE A.Name = 'Acb' AND B.Name = C.Name

  1. Does it first check the WHERE clause and then perform the INNER JOIN?

  2. Or does it first JOIN and then check the condition?

If it performs the JOIN first and then applies the WHERE condition, how can it apply WHERE conditions that refer to different JOINs?

shA.t
MohanKrishnaRS
  • Perhaps this would help: http://stackoverflow.com/questions/879893/sql-order-of-operations – Dmitry Egorov Jun 10 '15 at 07:36
  • There is a *logical* processing order for operations, but database systems are free to re-order those operations provided the final results produced are the same as if they had followed the logical processing order. You shouldn't care about the actual order unless it's performing badly: you should tell the server "what you want", not "how to do it" – Damien_The_Unbeliever Jun 10 '15 at 07:57
  • This will help: http://stackoverflow.com/questions/19477950/are-inner-join-and-outer-join-necessary/19478161#19478161 – Deepshikha Jun 10 '15 at 08:52
  • Possible duplicate of [SQL order of operations](https://stackoverflow.com/questions/879893/sql-order-of-operations) – philipxy Apr 22 '19 at 19:06

5 Answers

42

The conceptual order of query processing is:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY

But this is only a conceptual order. In practice, the engine may decide to evaluate the clauses in a different order. Here is a demonstration. Let's create 2 tables with 1,000,000 rows each:

CREATE TABLE test1 (id INT IDENTITY(1, 1), name VARCHAR(10))
CREATE TABLE test2 (id INT IDENTITY(1, 1), name VARCHAR(10))


;WITH cte AS(SELECT -1 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) d FROM
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t1(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t2(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t3(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t4(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t5(n) CROSS JOIN
(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t6(n))

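INSERT INTO test1(name) SELECT 'a' FROM cte

-- The CTE above is scoped to a single statement, so fill test2 from test1
INSERT INTO test2(name) SELECT name FROM test1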

Now run 2 queries:

SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id AND t2.id = 100
WHERE t1.id > 1


SELECT * FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1

Notice that the first query filters out most rows in the join condition, while the second query filters them in the WHERE clause. Look at the plans produced:

1 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(100)

2 TableScan - Predicate:[Test].[dbo].[test2].[id] as [t2].[id]=(1)

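This means that for the first query, the optimizer chose to evaluate the join condition first to filter out rows. For the second query, it applied the WHERE filter first: from t1.id = 1 and the join condition t2.id = t1.id it derived t2.id = 1 and used that as the predicate when scanning test2.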

Boops Boops
Giorgi Nakeuri
24

The logical order of query processing phases is:

  1. FROM - Including JOINs
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. ORDER BY
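
One visible consequence of this order is where a column alias can be used. The following is a minimal T-SQL sketch, assuming a hypothetical temp table `#people` that is not part of the question: the alias is defined in SELECT (step 5), so WHERE (step 2) cannot see it, but ORDER BY (step 6) can.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE #people (id INT, name VARCHAR(10));
INSERT INTO #people (id, name) VALUES (1, 'Acb'), (2, 'Xyz');

-- Fails with "Invalid column name": WHERE is processed before SELECT,
-- so the alias person_name does not exist yet.
-- SELECT name AS person_name FROM #people WHERE person_name = 'Acb';

-- Works: WHERE refers to the underlying column,
-- while ORDER BY is processed after SELECT and may use the alias.
SELECT name AS person_name
FROM #people
WHERE name = 'Acb'
ORDER BY person_name;

DROP TABLE #people;
```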

You can have as many conditions as you like, both in your JOINs and in your WHERE clause. For example:

Select * from #temp A 
INNER JOIN #temp B ON A.id = B.id AND .... AND ... 
INNER JOIN #temp C ON B.id = C.id AND .... AND ...
Where A.Name = 'Acb'
AND B.Name = C.Name
AND ....
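
To make this concrete, here is a small sketch of the query from the question, assuming `#temp` has just the two columns `id` and `Name` and two sample rows (both assumptions, since the question does not show the table). Logically, the joins first produce one combined row set with the columns of A, B and C side by side, and the WHERE clause then filters that row set, which is why it can refer to any of the three aliases.

```sql
-- Assumed shape and data for #temp; the question does not show them.
CREATE TABLE #temp (id INT, Name VARCHAR(10));
INSERT INTO #temp (id, Name) VALUES (1, 'Acb'), (2, 'Xyz');

-- Logically: FROM/JOIN builds rows (A.*, B.*, C.*) for id 1 and id 2,
-- then WHERE keeps only the row where A.Name = 'Acb' and B.Name = C.Name.
SELECT A.id, A.Name AS A_Name, B.Name AS B_Name, C.Name AS C_Name
FROM #temp A
INNER JOIN #temp B ON A.id = B.id
INNER JOIN #temp C ON B.id = C.id
WHERE A.Name = 'Acb' AND B.Name = C.Name;

DROP TABLE #temp;
```

With this sample data the query returns the single row for id 1.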
shA.t
sqluser
5

You can refer to this description of join optimization. Given the query:

SELECT * FROM T1 INNER JOIN T2 ON P1(T1,T2)
                 INNER JOIN T3 ON P2(T2,T3)
  WHERE P(T1,T2,T3)

The nested-loop join algorithm would execute this query in the following manner:

FOR each row t1 in T1 {
  FOR each row t2 in T2 such that P1(t1,t2) {
    FOR each row t3 in T3 such that P2(t2,t3) {
      IF P(t1,t2,t3) {
         t:=t1||t2||t3; OUTPUT t;
      }
    }
  }
}
Jacky
3

You can refer to MSDN:

The rows selected by a query are filtered first by the FROM clause join conditions, then the WHERE clause search conditions, and then the HAVING clause search conditions. Inner joins can be specified in either the FROM or WHERE clause without affecting the final result.
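
As a rough illustration of the last sentence of that quote, the sketch below writes the same inner join twice, once with the condition in the FROM clause and once in the WHERE clause. The temp tables `#a` and `#b` and their rows are hypothetical.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE #a (id INT, name VARCHAR(10));
CREATE TABLE #b (id INT, name VARCHAR(10));
INSERT INTO #a (id, name) VALUES (1, 'Acb'), (2, 'Xyz');
INSERT INTO #b (id, name) VALUES (1, 'Acb'), (3, 'Pqr');

-- Join condition specified in the FROM clause.
SELECT a.id, a.name AS a_name, b.name AS b_name
FROM #a a
INNER JOIN #b b ON a.id = b.id;

-- Same inner join with the condition specified in the WHERE clause.
SELECT a.id, a.name AS a_name, b.name AS b_name
FROM #a a, #b b
WHERE a.id = b.id;

DROP TABLE #a;
DROP TABLE #b;
```

Both statements return the same single row (id 1). This equivalence holds for inner joins only; the quoted text makes no such claim for outer joins.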

You can also run SET SHOWPLAN_ALL ON before executing your query to display its execution plan, so that you can compare the performance of the two forms.
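
A minimal usage sketch, assuming it is run from SSMS or sqlcmd (where `GO` separates batches) and that the tables `dbo.test1` and `dbo.test2` from the earlier answer exist. `SET SHOWPLAN_ALL` has to be the only statement in its batch, and while it is ON the query is not executed; the server returns the plan rows instead.

```sql
SET SHOWPLAN_ALL ON;
GO

-- Not executed while SHOWPLAN_ALL is ON: only the estimated plan is returned.
SELECT *
FROM dbo.test1 t1
JOIN dbo.test2 t2 ON t2.id = t1.id
WHERE t1.id = 1;
GO

SET SHOWPLAN_ALL OFF;
GO
```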

shA.t
Rahul Tripathi
  • I had a query with a join and LIMIT 0,10 that took 1:20 minutes, and it was solved only by removing the join. Maybe this works better for INNER joins and not for LEFT joins? – giuseppe Apr 03 '18 at 08:42
0

If you come to this site for the question about logical query processing, you really need to read this article on ITProToday by Itzik Ben-Gan.

Figure 3: Logical query processing order of query clauses

1 FROM 
2 WHERE 
3 GROUP BY 
4 HAVING 
5 SELECT
    5.1 SELECT list
    5.2 DISTINCT
6 ORDER BY 
7 TOP / OFFSET-FETCH
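
The later steps of this list can be seen in a small T-SQL sketch, assuming a hypothetical temp table `#scores`: DISTINCT is applied to the SELECT list first, then ORDER BY sorts the remaining rows, and only then does TOP cut the result down.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE #scores (player VARCHAR(10), score INT);
INSERT INTO #scores (player, score)
VALUES ('a', 10), ('a', 10), ('b', 30), ('c', 20);

-- 5.2 DISTINCT collapses the duplicate ('a', 10) row,
-- 6   ORDER BY sorts the three remaining rows by score, highest first,
-- 7   TOP keeps the first two rows of that sorted result.
SELECT DISTINCT TOP (2) player, score
FROM #scores
ORDER BY score DESC;

DROP TABLE #scores;
```

With this data the result is ('b', 30) followed by ('c', 20), which is what the logical order predicts: TOP is applied to the sorted, de-duplicated rows rather than to the first two rows stored in the table.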
Weihui Guo