2

It's hard for me to summarize the query I want to make, so maybe an example will make it clearer.

Let's say I have two primary tables:

employees:

| employee_id | employee_name |
| ----------- | ------------- |
| 1           | Alice         |
| 2           | Bob           |
| 3           | Carol         |

programming_languages:

| language_id | language_name |
| ----------- | ------------- |
| 1           | Python        |
| 2           | C++           |
| 3           | JavaScript    |

And a third table, with foreign keys from the first two, that says which employees know which programming languages:

employee_known_languages:

| employee_id | language_id |
| ----------- | ----------- |
| 1           | 1           |
| 2           | 1           |
| 2           | 2           |
| 2           | 3           |
| 3           | 3           |

So in this example Alice knows Python, Bob knows Python C++ and JavaScript, and Carol knows JavaScript.

If I wanted to write a query to get all the employees who know C++, I know what I'd want to do:

SELECT employee_name from
    employees INNER JOIN programming_languages INNER JOIN employee_known_languages
WHERE language_name = C++

What if I wanted to get all the employees who know C++ and Python? Or, more generally, some arbitrary subset of the values of language_name?

Ideally I'd be able to write this in a way where I can write a Python function that accepts a list of arguments and does the query, but I think just being able to figure out the SQL itself would help.

forpas
  • 160,666
  • 10
  • 38
  • 76
Nathan Pierson
  • 5,461
  • 1
  • 12
  • 30

2 Answers2

2

Use a CTE that returns an arbitrary subset of the values of language_name and filter the results of the joins of the 3 tables for these language_names only.
Then use aggregation and set the condition in the HAVING clause:

WITH cte(language_name) AS (VALUES ('C++'), ('Python'))
SELECT e.* 
FROM employees e
INNER JOIN employee_known_languages ep ON ep.employee_id = e.employee_id
INNER JOIN programming_languages p ON p.language_id = ep.language_id
WHERE p.language_name IN cte
GROUP BY e.employee_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM cte);

See the demo.

forpas
  • 160,666
  • 10
  • 38
  • 76
  • Was able to adapt this to my project, including the ability to have the set of target values be of unknown length. Very cool. – Nathan Pierson Aug 08 '22 at 22:09
1

I believe a nested query is required. First get the matches, then count them.

SELECT employee_name, COUNT(*) AS count FROM (
    SELECT e.employee_name, pl.language_name 
    FROM employees e 
    INNER JOIN programming_languages pl 
    INNER JOIN employee_known_languages ekl 
    ON (ekl.employee_id=e.employee_id) and (ekl.language_id = pl.language_id)
    WHERE pl.language_name IN ('Python', 'C++')
)
GROUP BY employee_name
HAVING count > 1;

Output:

sqlite> .read x.sql
Bob|2
sqlite>
Tim Roberts
  • 48,973
  • 4
  • 21
  • 30