0

I have a table, "posts" like so:

+----+---------+------------+
| id |  author |    text    |
+----+---------+------------+
| 1  |   bob   |     hello  |
| 2  |   jim   |     hi bob |
+----+---------+------------+

and also "tags" like so:

+----+--------------+
| id |     name     |
+----+--------------+
| 1  |  bugs        |
| 2  |  project_foo |
| 3  |  project_bar |
+----+--------------+

and lastly "post_tags" like:

+------+------+
| post |  tag |
+------+------+
| 1    |  1   |
| 1    |  2   |
| 2    |  1   |
| 2    |  3   |
+------+------+

Posts are tagged with any number of tags. With the above data, what is the best way for me to query "give me the posts that have ALL tags in the list (1,2)"? These would be the posts tagged as "bugs, project_foo"

A joined search with "IN" doesn't work for me as it is returning posts that match any of the listed tags, so I will receive "project_bar" bug posts as well even though they aren't even tagged as "project_foo". It would be completely fine to return posts that contain tags not in the specified list, given that the post at least has the tags mentioned. So "bugs, project_foo, project_bar" would be fine, but "bugs, project_bar" would not.

I would prefer to do this with normal ResultSet usage, but am comfortable doing in a custom ResultSource::View. Problem is, I can't figure out how to do this in raw sql either.

Is there a practical way to do what I am attempting, or is my table setup just wrong for what I am trying to achieve?

Thank you.

dachi
  • 1,604
  • 11
  • 15
FoohonPie
  • 11
  • 1
  • For the SQL statement, you could use `GROUP BY` and `HAVING` like this: `SELECT post FROM post_tags WHERE tag IN (1, 2) GROUP BY post HAVING COUNT(post) = 2` – nwellnhof Mar 04 '14 at 17:15

2 Answers2

0

The SQL statement would be:

  SELECT post
    FROM post_tags
   WHERE tag IN (1, 2)
GROUP BY post
  HAVING COUNT(post) = 2

So something like the following should work with DBIx::Class:

my @tags = (1, 2);

$schema->resultset('PostTags')->search({
    tag => { -in => \@tags },
}, {
    columns  => 'post',
    group_by => 'post',
    having   => \[ 'COUNT(post) = ?', scalar(@tags) ],
});

This would only retrieve the post ids. You could try to add a join or prefetch to retrieve the post contents in a single statement.

nwellnhof
  • 32,319
  • 7
  • 89
  • 113
0

I'd go with this query. It'd have a good performance but use it only if you can be sure you'll only need to check for only 2 values. If your requirements could expand to include a third or even fourth one then nwellnhof's answer is better:

 SELECT post
    FROM post_tags a
   WHERE EXISTS (
         SELECT 1 FROM post_tags b1
         WHERE a.post = b1.post
         AND b1.tag=1 )
   AND EXISTS (
         SELECT 1 FROM post_tags b2
         WHERE a.post = b2.post
         AND b2.tag=2 );
Joe Pineda
  • 5,521
  • 3
  • 31
  • 40