0

Using MS Access SQL, I am trying to figure out how to delete rows with duplicate values in one column.

Example Data:

Department      | Job Category        | Name | Requestor ID
Medical Affairs | Vice President      | Kim  | 123
Medical Affairs | President           | Abe  | 123
Nursing         | Sr Vice President   | Bob  | 456
Nursing         | Exec Vice President | Jim  | 456
Patient Care    | President           | Sam  | 456

In the example above, I want to delete the rows with duplicate Requestor ID to only leave the row with the lowest Job Category (even if the Job Category is in different Departments). That means I would only be left with these rows below when I'm done deleting my duplicates:

Results:

Department      | Job Category        | Name | Requestor ID
Medical Affairs | Vice President      | Kim  | 123
Nursing         | Sr Vice President   | Bob  | 456

I am pretty new to MS Access SQL so I don't even know where to start. I appreciate all the help I can get.

j3ff
  • 5,719
  • 8
  • 38
  • 51
  • 1
    Is there any logic around which department or category should be removed vs kept? – alexherm Apr 03 '20 at 22:45
  • 1
    Ideally you would create a rank table which outlines the Job Category and which is highest. Once you have this you can link the tables, sort them accordingly and deal with the duplicates. – Jim L Apr 04 '20 at 00:07

2 Answers2

0

How do you know what the "lowest" job category is?

Well, you can do something like this:

select e.*
from example as e
where e.jobcategory <> (select top (1) e2.jobcategory
                        from example e2
                        where e2.requestorid = e.requestorid
                        order by switch(jobcategory = 'President', 1,
                                        jobcategory = 'Exec Vice President', 2,
                                        jobcategory = 'Sr Vice President', 3,
                                        jobcategory = 'Vice President', 4,
                                        ) desc
                       );
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

I realized that I could get the results that I wanted by constructing the code below:

SELECT * 
FROM TableA 

UNION 

SELECT * 
FROM TableA AS a 
RIGHT JOIN TableB AS b 
ON a.[Requestor ID] = b.[Requestor ID] 
WHERE a.[Requestor ID] IS NULL

By using this code, I will get unique Requestor IDs and will not run into the issue of having duplicate Requestor IDs with different Departments, Job Categories and Names.

The code below caused the issue I was facing, whereby I got duplicate Requestor IDs with different Departments, Job Categories and Names since I was unable to filter out rows with Requestor IDs that were already in my original table (TableA).

SELECT *
FROM TableA

UNION

SELECT *
FROM TableB

Thanks, everyone, for your help!