27

How do I select posts that contain a specific tag if there is a many-to-many relationship between posts and tags?

The problem I am having is that because of the where tag.name = 'xxx', only that tag is selected. I want to select all posts that have the tag specified, together with all of their tags, e.g.,

Post 1 -> tag1, tag2
Post 2 -> tag1, tag3
Post 3 -> tag2, tag3

Currently what I get is:

Post 1 -> tag2 // missing tag1
Post 3 -> tag2 // missing tag3 
1201ProgramAlarm
  • 32,384
  • 7
  • 42
  • 56
Jiew Meng
  • 84,767
  • 185
  • 495
  • 805

1 Answers1

34

Assuming these tables:

  • Posts: id, author, date, content
  • Tags: id, name
  • PostTags: post_id, tag_id

The last table is often called a join table and facilitates a many-to-many relationship between Posts and Tags.

SELECT p.*
FROM posts p
JOIN posttags pt ON p.id = pt.post_id
JOIN tags t ON pt.tag_id = t.id
WHERE t.name = 'sql'

Basically, think of a many-to-many relationship as two one-to-many relationships, because that's how they're implemented in normal RDBMSs. So the above query has a one-to-many join from Posts to PostTags and another from Tags to PostTags.

The PostTags table I created has a composite primary key, being (post_id, tag_id). That combination will be unique. Many disfavour composite keys so you'll often see people creating a primary key column:

  • PostTags: id, post_id, tag_id

Either method is fine. It's largely a philosophical difference.

Update: if you want to select all the posts that have a particular tag and all the tags those posts have then:

SELECT p.*
FROM posts p
JOIN posttags pt ON p.id = pt.post_id
JOIN tags t ON pt.tag_id = t.id
WHERE p.id IN
  (SELECT post_id
  FROM PostTags pt
  JOIN tags t ON pt.tag_id = t.id
  WHERE t.name = 'xyz')

Another way to do this is:

SELECT p.*
FROM posts p
JOIN posttags pt ON p.id = pt.post_id
JOIN tags t ON pt.tag_id = t.id
WHERE EXISTS
  (SELECT post_id
  FROM PostTags pt
  JOIN tags t ON pt.tag_id = t.id
  WHERE t.name = 'xyz'
  AND pt.post_id = p.id)

Which performs better will need to be tested and may vary depending on database vendor and version. A good optimizer (ie Oracle) will probably optimize them to perform the same. Others may not.

Now this will get you rows back like this:

Post 1, tag 1
Post 1, tag 2
Post 3, tag 2
Post 3, tag 3

so you'll need to combine them, preferably in application logic rather than SQL. Some RDBMSs have vendor-specific extensions for this kind of thing, like MySQL's GROUP_CONCAT() function.

John
  • 15,418
  • 12
  • 44
  • 65
cletus
  • 616,129
  • 168
  • 910
  • 942
  • i did just `SELECT post, tags FROM Application\Models\Post post INNER JOIN post.tags tags` b4 i saw this post, and it seem to work. without the tag name tho. is there any difference between yours and mine? – Jiew Meng Aug 03 '10 at 10:38
  • 1
    i also find that because of the where clause, only the tag specified there is selected. i want to get posts with that tag. but for the matched posts, i want all, not just the specified tag. maybe i work on a subquery – Jiew Meng Aug 03 '10 at 10:45
  • @jiewmeng I don't understand your questions. – cletus Aug 03 '10 at 11:30
  • the difference between the 2 is just the last line `AND pt.post_id = p.id`, right? hmm... why is that required? and that will be SQL or DQL? SQL right? – Jiew Meng Aug 04 '10 at 01:04