0

I have a directory table in a MySQL database that contains user ID's, user types, and assorted other directory information. There are two types of users: employee and student. The primary key is (userID, userType), so any given user can have an employee record and a student record (in the event that they are both an employee and a student).

I would like to perform queries on a subset of this table, so that:

  • If a user has only an employee xor a student record, that record is used, and
  • If a user has both an employee and a student record, the employee record is used and the student record is ignored

I will not need to include an ORDER BY clause in the queries, but they can include rather complicated WHERE clauses (including queries on userType itself).

Ideally, I would like to do this without having to create extra tables or views, but if an extra view is necessary I can ask the sysadmin for CREATE VIEW permissions.

LeafStorm
  • 3,057
  • 4
  • 24
  • 28

3 Answers3

2

I'd do next, If I need it with minimum DDL's -

Select * From YourTable where UserType='Employee'
Union 
Select * from YourTable s where s.UserType='Student' and s.UserId not in (Select UserId from YourTable where UserType='Employee')

First will pick employees, second - students only. Did not test, but it should work for t-sql.

If you do not like 'not in'

Select * From YourTable where UserType='Employee'
Union 
Select distinct s.* from YourTable s
   left join YourTable e on s.UserId=e.UserId and s.UserType = 'Student' and e.UserType='Employee'
Where e.UserId is null
Val Bakhtin
  • 1,434
  • 9
  • 11
  • +1 Good answer, but you can use UNION ALL instead of UNION, just in case MySQL is too dumb to see that no rows can actually match (and tries to remove the "duplicates", wasting performance). – Branko Dimitrijevic May 14 '12 at 19:04
  • I would be careful with the "not in". Sometimes these are optimized very poorly by the SQL compiler. – Gordon Linoff May 14 '12 at 19:06
  • @GordonLinoff Yes, when calculating plan for [NOT] IN or EXISTS, MySQL will often (incorrectly) pick nested loops instead of merge join. The question is: can we do better than nested loops in this case? OP should measure both this and your answer (+1 BTW). – Branko Dimitrijevic May 14 '12 at 19:54
  • Interestingly, it is not only mysql that has problems with "not in". Over the years, I've just given up on using that construct when the list is a table or subquery. I just use the left outer join version. And, yes, if he is interested in performance he can time the queries or at least post the execution plans. – Gordon Linoff May 14 '12 at 20:09
  • @GordonLinoff In my experience, "serious" databases such as Oracle, MS SQL Server, PostgreSQL, DB2 etc. will produce the same plan for the same meaning of the query. People are often [not careful about NULLs](http://www.sqlbadpractices.com/using-not-in-operator-with-null-values/) and unintentionally express a subtly different meaning with NOT IN (from what a JOIN would mean) and then unjustifiably blame NOT IN for bad performance. – Branko Dimitrijevic May 15 '12 at 16:01
1

I recommend that you have a table that has one row per user with the appropriate user type. In this case, you can simlpy take min(userType), since that chooses employee when both are present.

Then, you can use this in a query. Here is an example:

select table.*
from table join
     (select userid, min(usertype) as usertype
      from table
      group by userid
     ) users
     on table.userid = users.userid and
        table.usertype = users.usertype
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Didn't actually test it, but something like this should work:

SELECT *
FROM directory
WHERE
    userType = 'employee'
    OR userID NOT IN (
        SELECT userID
        FROM directory
        WHERE userType = 'employee'
    )

In plain English: select all employees (left of OR) and all students for which there are no corresponding employees (right of OR).

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167
  • I ended up going with something like this, but with `usertype = 'student' AND NOT EXISTS (SELECT userID FROM directory WHERE userType = 'employee')` instead of what you had after the `OR`. I'm not very good at reading EXPLAIN output, but it appears to me that MySQL can optimize the `EXISTS` version better. Thanks for the idea! – LeafStorm May 15 '12 at 15:17