I have a simple two column table. For example's sake we can use the following to build the data:
CREATE TABLE Duplicates
(assignmentid varchar(5), questionid varchar(5));
INSERT INTO Duplicates
(assignmentid, questionid)
VALUES
('aaaaa', '11111'),
('aaaaa', '22222'),
('bbbbb', '22222'),
('bbbbb', '33333'),
('bbbbb', '33333');
There are two rows that are identical. There is also a question that appears on multiple assignments. The latter is a valid scenario, and I am trying to get a query of all the questions that are part of multiple assignments. So my desired output is:
aaaaa, 22222
bbbbb, 22222
I was able to get this results with this:
SELECT main.questionid, sub.assignmentid
FROM (
SELECT questionid, count(assignmentid) AS AssignmentCount
FROM (
SELECT DISTINCT questionid, assignmentid
FROM Duplicates
) sub
GROUP BY questionid
HAVING AssignmentCount > 1
) main
INNER JOIN (
SELECT DISTINCT questionid, assignmentid
FROM Duplicates
) sub ON main.questionid = sub.questionid;
As you can see the DISTINCT subquery is being repeated twice. I could avoid this by using the WITH command, but my understanding is that this does not necessarily mean that the subquery will be executed only once. So now I am here in StackOverflow, to ask if anyone knows a more efficient way to run this query.