6

I've got the following query:

select distinct a.id, a.name
from Employee a
join Dependencies b on a.id = b.eid
where not exists 
    ( 
select * 
    from Dependencies d 
    where b.id = d.id 
    and d.name  = 'Apple'
    )
and exists 
    (
    select * 
    from Dependencies c 
    where b.id = c.id 
    and c.name  = 'Orange'
    );

I have two tables, relatively simple. The first Employee has an id column and a name column The second table Dependencies has 3 column, an id, an eid (employee id to link) and names (apple, orange etc).

the data looks like this Employee table looks like this

id  | name
-----------
1   | Pat
2   | Tom
3   | Rob
4   | Sam

Dependencies

id  | eid | Name
--------------------
1   | 1   |  Orange
2   | 1   |  Apple
3   | 2   |  Strawberry
4   | 2   |  Apple
5   | 3   |  Orange
6   | 3   |  Banana

As you can see Pat has both Orange and Apple and he needs to be excluded and it has to be via joins and i can't seem to get it to work. Ultimately the data should only return Rob

Linger
  • 14,942
  • 23
  • 52
  • 79
YelizavetaYR
  • 1,611
  • 6
  • 21
  • 37

3 Answers3

13

Inner join with the name you want, left join on the name you dont, then use where to ensure the left join fails to match, like so (SQL Fiddle):

select distinct a.id, a.name
from Employee a
  inner join Dependencies b on a.id = b.eid
    and b.name = 'Orange'
  left join Dependencies c on ( a.id = c.eid
    and c.name = 'Apple')
where c.id is null;
Linger
  • 14,942
  • 23
  • 52
  • 79
Jaaz Cole
  • 3,119
  • 1
  • 15
  • 20
  • Is it okay to have multiple joins on Dependencies with various aliases? and i'm not sure what c.id is null means? – YelizavetaYR Jun 10 '14 at 15:27
  • Yes, you'll end up with a table (or likely index) spool in the query plan (a redundancy operation when a table is used twice, but you were using it 3 times before), but the operation is perfectly legal. As to the id is null - the left join fills the columns of C with null when the condition fails to match, rather than eliminate rows. Thus, c.id is null when the join condition fails. Go look up the difference between different types of joins. Hope this helps. – Jaaz Cole Jun 10 '14 at 15:29
  • @YelizavetaYR So write it this way... "SELECT DISTINCT a.id a_id,a.name,c.id c_id from... where 1=1" – Strawberry Jun 10 '14 at 15:30
  • 1
    As a side note: `select a.x from a left join b on a.x = b.x where b.x is null` is the idiom sometimes used in MySQL to simulate the *minus* operator, which is not implemented in MySQL. – proskor Jun 10 '14 at 15:38
  • 2
    @stawberry please don't encourage that where 1=1 crap. There is no reason to ever add that to a query. – HLGEM Jun 10 '14 at 15:42
0

Two joins to Dependencies will be needed, as there are 2 tests. Neglecting performance for a moment, you could try to improve the understandability of the joins by naming the aliases, e.g.:

SELECT DISTINCT e.ID, e.Name
   FROM Employee e
   LEFT OUTER JOIN Dependencies withApple
      ON withApple.eid = e.id
      AND withApple.Name = 'Apple'
   LEFT OUTER JOIN Dependencies withOrange
      ON withOrange.eid = e.id
      AND withOrange.Name = 'Orange'
   WHERE
      withApple.id IS NULL -- Don't want this
      AND
      withOrange.id IS NOT NULL -- Do want this.

SqlFiddle

StuartLC
  • 104,537
  • 17
  • 209
  • 285
  • Though it works, I think it is actually worse. Using *left outer join* only makes sense if you have to deal with missing rows or you want to calculate the difference. Since `id` is the column you join on, the condition `id is not null` is pointless. And besides, the comments "do want this" and "don't want this" do not help either. Shouldn't they be the other way round? As the case may be, it just proves the point. – proskor Jun 11 '14 at 09:38
  • You are right - the naming of the aliases preempted the filtering and conveyed a confusing message. The idea behind LOJ as opposed to IJ was to try and delay the filtering to the `WHERE` clause, and to confer the same symmetry to the unwanted apple as to the wanted Orange. But you are right, this isn't good Sql, the intention was just an interim stepping stone which hopefully would click something in the OP's understanding. – StuartLC Jun 11 '14 at 10:34
  • 1
    Yeah, ironically, in my opinion the original query with *exists subqueries* (except the useless join) conveyed the intention better than any proposed solution with *joins*, so I wouldn't even try to change that and leave it as it is. :) – proskor Jun 11 '14 at 11:01
0

Another version, similar to Jaaz Cole's one is:

select distinct a.id, a.name
   from Employee a
inner join Dependencies b on a.id = b.eid
   and b.name = 'Orange'
left join Dependencies c on a.id = c.eid
where (c.id is null or c.name != 'Apple');

In substance this comes from logics: The NEGATION(A & B) = NEGATION(A) OR NEGATION(B)

nicolass
  • 526
  • 5
  • 8