2

I am trying to write a query in Oracle SQL that takes two parameters and finds throughout the table all of the instances where only either or occur, regardless of how many of them there are within the search. Here's an example of what I'm looking for:

| ID  | FileType | COUNT(FileType) |
|-----|----------|-----------------|
| 1   | txt      | 1               |
| 1   | png      | 3               |
| 1   | jpg      | 2               |
====================================
| 2   | txt      | 0               |
| 2   | png      | 6               |
| 2   | jpg      | 0               |
====================================
| 3   | txt      | 0               |
| 3   | png      | 0               |
| 3   | jpg      | 5               |
====================================
| 4   | txt      | 0               |
| 4   | png      | 3               |
| 4   | jpg      | 1               |
====================================
| 5   | txt      | 5               |
| 5   | png      | 0               |
| 5   | jpg      | 3               |

My goal is to take all the ID's with ONLY png's OR jpg's and omit the rest, so I only want ID's 2, 3, and 4 to be returned.

I've tried to look up solutions here and haven't found anything that seems to match my case. (Closest being this question: SQL select rows with only a certain value in them)

I did manage to discover that the problem can be achieved through relational division, but I haven't had any experience with it thus far. So far my query looks like this:

SELECT DISTINCT ID, FileType, COUNT(FileType)
FROM Table ta
WHERE (ta.FileType = 'jpg' or ta.FileType = 'png') and
NOT EXISTS
    (SELECT *
     FROM Table tb
     WHERE ta.FileType = tb.FileType and
     (tb.FileType != 'jpg' or tb.FileType != 'png'))
GROUP BY ID, FileType;

When I try this, I get no results. Anyone have any ideas where I went wrong here?

William Robertson
  • 15,273
  • 4
  • 38
  • 44
Cody
  • 49
  • 9

3 Answers3

2

You could use set operators:

SELECT ID FROM tab WHERE FileType IN ('jpg', 'png')
MINUS
SELECT ID FROM tab WHERE FileType NOT IN ('jpg', 'png')

Assumption: FileType is not nullable.


Handling NULL:

SELECT ID FROM tab WHERE FileType IN ('jpg', 'png')
MINUS
SELECT ID 
FROM (SELECT * FROM tab WHERE FileType IS NOT NULL) 
WHERE FileType NOT IN ('jpg', 'png')
Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
  • 1
    This is a good and nice query. You can improve it by writing it in this way: `SELECT ID FROM tab WHERE FileType IN ('jpg', 'png') minus SELECT ID FROM tab WHERE FileType NOT IN ('jpg', 'png')` – Florin Ghita Jan 14 '19 at 16:11
  • 1
    This answer express the best the request. Pure and powerful SQL. – Florin Ghita Jan 14 '19 at 16:12
  • 1
    Thank you, this is the clearest and easiest solution to understand. I believe this will work. – Cody Jan 14 '19 at 16:20
  • My apologies for necroing so quickly and for not realizing sooner, but if the FileType is Nullable, conventional wisdom tells me that replacing IN with '= (jpg or png)' and NOT IN with '!= (jpg or png)' would be correct, but this does not seem to be the case. Any ideas in this situation? – Cody Jan 14 '19 at 16:43
  • 1
    @Cody Null values could be removed using inlined view. – Lukasz Szozda Jan 14 '19 at 16:46
  • Thank you for all the help! – Cody Jan 14 '19 at 16:49
1

You are close. Just see how many distinct filetypes you have. Here are your IDs:

SELECT ID
FROM Table ta
GROUP BY ID
HAVING count(distinct FileType) = 1 and max(ta.FileType) in ('jpg','png');

Update:The upper will fail for case 4.

This will do it but it is ugly:

SELECT ID
FROM Table ta
GROUP BY ID
HAVING count(distinct FileType) <= 2 
   and max(ta.FileType) in ('jpg','png') 
   and min(ta.FileType) in ('jpg','png');

It is ugly because you cannot extend it for 3 values.

Florin Ghita
  • 17,525
  • 6
  • 57
  • 76
0

Just use a "helper" table:

WITH TypeCounts AS
(
  SELECT ID, FileType, COUNT(*) AS CNT
  FROM Table
  GROUP BY ID, FileType
)
SELECT *
FROM Table
LEFT JOIN TypeCounts txt ON txt.ID = Table.ID AND txt.FileType = 'txt'
LEFT JOIN TypeCounts jpg ON jpg.ID = Table.ID AND jpg.FileType = 'jpg'
LEFT JOIN TypeCounts png ON png.ID = Table.ID AND png.FileType = 'png'
WHERE COALESCE(txt.CNT,0) = 0 AND
    ( COALESCE(jpg.CNT,0) > 0 OR COALESCE(png.CNT,0) > 0)

What is nice about this solution is that it is very clear what the business rules are and is therefore easier to maintain.

Hogan
  • 69,564
  • 10
  • 76
  • 117