2

I have a query that takes tag Ids obtained from a previous query, and selects post ids from a post_tag_map table.

$tagIds for example can be something like "3,4,23,54".

'SELECT post_id FROM post_tag_map WHERE tag_id IN (' . $tagIds . ') ORDER BY `post_id`;'

This query works fine for its purpose, but it returns any post_id that has any one of those tag_ids.

I'm trying to obtain only post_ids that contain ALL of those tag_ids.

For example, if I search for "news, politics, campaign", I only want to return posts that have been tagged with all three of those tags, not any post that has one of the three.

How can I alter my query?

Edit: By request, my table structure is 3 tables. One (post) with post_id and post information, one (tag) with tag_id and tag_name, and post_tag_map, just maps the two Ids together.

dcclassics
  • 896
  • 1
  • 12
  • 38

1 Answers1

6

Well it's going to be a bit more complex

Keep with your IN thing but also group posts to find ones with exact match

WHERE tag_id IN (3,4,23,54) GROUP BY `post_id` HAVING count(1) = 4;

where 4 is a number of tags we're looking for

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
  • So I can just do a count on the number of items in the array and use that as count(1) = $count? – dcclassics Apr 04 '14 at 18:55
  • yes. I am writing constant numbers because I am reluctant to write variables right in the query. And you shouldn't use any variables in the query as well – Your Common Sense Apr 04 '14 at 18:56
  • 1
    +1 : out of box thinking... I would have written 5 AND statements :D – Fahim Parkar Apr 04 '14 at 19:01
  • This works great. I did put a variable in there at the end based off of `$tagCount = count($tagIds);`. Is there a serious security concern here? – dcclassics Apr 04 '14 at 19:05
  • @Kevin well, for a count result it's obviously not... But it should be a habit, a reflex, a taboo, an inviolable rule - to put any data into query via placeholder only – Your Common Sense Apr 04 '14 at 19:07
  • Yes, I make it a common practice to use prepared statements whenever possible, explicitly for any user entered data. For these queries, they're getting results using values directly resultant from another query. – dcclassics Apr 04 '14 at 19:08
  • 1
    @Kevin strictly speaking, it's exact case for the second order SQL injection. I know these values are from int field... but again, I prefer a rule that never gets violated. – Your Common Sense Apr 04 '14 at 19:12
  • @YourCommonSense can I ask an additional question? This may be going backwards on my previous logic. But let's say the user enters "guitar" "acoustic" "taylor", but "taylor" doesn't actually exist in the tag database. Is there an easy way to ignore those? – dcclassics Apr 08 '14 at 18:06
  • @KevinO'Brien it should be ignored by default – Your Common Sense Apr 08 '14 at 18:13
  • @YourCommonSense Ooops you're right! I was submitting a query with zero existing tags. So of course it returned an error! Okay thanks. – dcclassics Apr 08 '14 at 18:14
  • @KevinO'Brien you may use NULL id array is empty. Or just make a condition and run no query at all – Your Common Sense Apr 08 '14 at 18:16