I have the following tables in MS SQL Server: Tasks, Users, Tags, TaskTags (maps a task to a tag), and UserTags (maps a user to a tag).
Given a User U, I want to find all tasks T where every tag of T is also a tag of U (e.g. a task should be returned if its tags are a subset of the user's tags).
Here is a table script with some sample data (it can be run at http://sqlfiddle.com/ with MS SQL Server 17):
CREATE TABLE [dbo].[Tasks](
[TaskId] [int] NOT NULL PRIMARY KEY,
[TaskName] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[Users](
[UserId] [int] NOT NULL PRIMARY KEY,
[UserName] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[Tags](
[TagId] [int] NOT NULL PRIMARY KEY,
[TagName] [nvarchar](MAX) NOT NULL
)
CREATE TABLE [dbo].[TaskTags](
[TaskId] [int] NOT NULL,
[TagId] [int] NOT NULL
)
CREATE TABLE [dbo].[UserTags](
[UserId] [int] NOT NULL,
[TagId] [int] NOT NULL
)
INSERT INTO Tasks VALUES (1,'Task for all SWEs');
INSERT INTO Tasks VALUES (2,'Task for USA SWEs');
INSERT INTO Tasks VALUES (3,'Task for all PMs');
INSERT INTO Tasks VALUES (4,'Task for Europe PMs');
INSERT INTO Users VALUES (1,'Europe SWE');
INSERT INTO Users VALUES (2,'USA SWE');
INSERT INTO Users VALUES (3,'Europe PM');
INSERT INTO Users VALUES (4,'USA PM');
INSERT INTO Tags VALUES (1,'swe');
INSERT INTO Tags VALUES (2,'pm');
INSERT INTO Tags VALUES (3,'usa');
INSERT INTO Tags VALUES (4,'europe');
INSERT INTO TaskTags VALUES (1,1);
INSERT INTO TaskTags VALUES (2,1);
INSERT INTO TaskTags VALUES (2,3);
INSERT INTO TaskTags VALUES (3,2);
INSERT INTO TaskTags VALUES (4,2);
INSERT INTO TaskTags VALUES (4,4);
INSERT INTO UserTags VALUES (1,1);
INSERT INTO UserTags VALUES (1,4);
INSERT INTO UserTags VALUES (2,1);
INSERT INTO UserTags VALUES (2,3);
INSERT INTO UserTags VALUES (3,2);
INSERT INTO UserTags VALUES (3,4);
INSERT INTO UserTags VALUES (4,2);
INSERT INTO UserTags VALUES (4,3);
I was able to figure out the inverse of this problem, when the Task T is given. E.g. given Task T, return all Users U where the tags of T are a subset of U. Here is that query:
WITH thisTaskTags AS (
SELECT DISTINCT TaskTags.TagId
FROM TaskTags
WHERE TaskTags.TaskId = @taskId
)
SELECT UserTags.UserId
FROM UserTags JOIN thisTaskTags
ON UserTags.TagId = thisTaskTags.TagId CROSS JOIN
(SELECT COUNT(*) AS keycnt FROM thisTaskTags) k
GROUP BY UserTags.UserId
HAVING COUNT(thisTaskTags.TagId) = MAX(k.keycnt)
When @taskId = 1, UserIds 1 and 2 are returned, and when @taskId = 2, only UserId 2 is returned (correct behavior).
However when I tried to convert this to returning all tasks a given user should have, I ran into trouble. I tried this query:
WITH thisUserTags AS (
SELECT DISTINCT UserTags.TagId
FROM UserTags
WHERE UserTags.UserId = @userId
)
SELECT TaskTags.TaskId
FROM TaskTags JOIN thisUserTags
ON thisUserTags.TagId = TaskTags.TagId CROSS JOIN
(SELECT COUNT(*) AS keycnt FROM thisUserTags) k
GROUP BY TaskTags.TaskId
HAVING COUNT(thisUserTags.TagId) = MAX(k.keycnt);
However this only returns tasks where all the task tags match all the user tasks, e.g. if U had tags: [a,b,c] it would only get tasks with tags: [a,b,c] instead of [a], [b], [b,c], etc.
With concrete examples, if you set @userId = 1, no task IDs are returned, when the correct output would be getting 1 row, Task ID = 1. And when @userId = 2, only taskID 2 is returned, when both taskIDs 1 and 2 should be returned (i.e. if a task only has the "swe" tag, all "swe" users should get it, but if a task has both "swe" and "usa", only users who have both of those tags should get it).
I also tried this query:
SELECT DISTINCT Tasks.TaskId FROM Tasks
INNER JOIN TaskTags ON TaskTags.TaskId = Tasks.TaskId
WHERE TaskTags.TagId IN (SELECT TagId from UserTags where UserId = @userId)
GROUP BY Tasks.TaskId
But the issue with this is it returns any task that has any tag in common, so U with tags: [a,b,c] would get T with tags: [b,d] even though U doesn't have tag d.
Again with concrete examples, if @userId = 1, taskIDs 1,2, and 4 are returned, when only taskIds 1 and 2 should be returned (task ID 4 should only be assigned to users with both tags "europe" and "pm", here it is erroneously being assigned to a user with tags "europe" and "swe" due to the common "europe" tag).
Could someone shed some light here?