3

Suppose I have a table

 Table A
 ---------------
 id    date_id (yyyyMMdd format)
 ---------------
 1     20120101
 2     20120102
 3     20120103

What is the difference between doing:

 select a1.* from A a1
 left outer join A a2
     on a1.id = a2.id 
     and a1.date_id < a2.date_id
 where a1.date >= 20120103
 and a2.id is null

And

 select a1.* from A a1
 left outer join A a2
      on a1.id = a2.id
      and a1.date_id < a2.date_id
      and a1.date_id >=20120103
 where a2.id is null

For the first query I get

 id   date_id
 --------------
 3    20120103

As expected, but for the second one I get (a2 columns not selected in query but shown here for clarity)

 a1.id   a1.date_id  a2.id   a2.date_id
 ---------------------------------------
 1       20120101    
 1       20120101    
 2       20120102
 2       20120102
 3       20120103

Shouldn't the second query also filter by a1.date_id >= 20120103? Why is it returning rows with date_id's 20120101 and 20120102?

fo_x86
  • 2,583
  • 1
  • 30
  • 41

2 Answers2

1

In the case of outer join, ON clause will have no effect on outer table. This means that all rows from outer table will be returned and ON clause just determine which row of a joined table joins outer table. And those rows of outer table that do not meet condition(s) in the ON clause simply will be extended with null values for columns of the table that is being joined. WHERE clause filters the final result set.

Nick Krasnov
  • 26,886
  • 6
  • 61
  • 78
0

In the first query, it will first fetch the result set of on

a1.id = a2.id 
     and a1.date_id < a2.date_id and then it will apply the `where clause` in the result set and so you get '20120103'.

In the second query you are taking all the records that satisfy

on a1.id = a2.id
      and a1.date_id < a2.date_id
      and a1.date_id >=20120103

and thats why you get those many rows. Hope this helps you.

Mariappan Subramanian
  • 9,527
  • 8
  • 32
  • 33