3

Example dataset:

id | tag
---|------
1  | car
1  | bike
2  | boat
2  | bike
3  | plane
3  | car

id and tag are both indexed.

I am trying to get the id who matches the tags [car, bike] (the number of tags can vary).

A naive query to do so would be:

SELECT id
FROM test
WHERE tag = 'car'
    OR tag =  'bike'
GROUP BY id
HAVING COUNT(*) = 2

However, doing so is quite inefficient because of the group by and the fact that any line that match one tag is taken into account for the group by (and I have a large volumetry).

Is there a more efficient query for this situation?

The only solution I see would be to have another table containing something like:

id | hash
---|------
1  | car,bike
2  | boat,bike
3  | plane,car

But this is not an easy solution to implement and maintain up to date.

Additional infos:

  • the name matching must be exact (no fulltext index)
  • the number of tags is not always 2
Matthieu Napoli
  • 48,448
  • 45
  • 173
  • 261

3 Answers3

0

try this:

SELECT id
FROM test
WHERE tag in('car','bike')
GROUP BY id
HAVING COUNT(*) = 2

And create a nonclustered index on tag column

AnandPhadke
  • 13,160
  • 5
  • 26
  • 33
0

Here you go:

select id from TEST where tag = 'car' and ID in (select id from TEST where tag='bike')
Romo
  • 222
  • 4
  • 11
  • From the OP: "the number of tags is not always 2" – Tim Lehner Oct 08 '12 at 14:18
  • Yes, but you can extend the query with more "ID in". You already have to make some kind of "knowing how many things to search for" in the query. And in this example you can use the clean index without any group by and having count. – Romo Oct 08 '12 at 14:36
  • Okay to vote it down, but please analyse the query against the other examples in the answers. You will se it is way faster. And no matter how you do it, you still have to "build" the query on how many "tag" (car, bike an so on) there are in the query. – Romo Oct 09 '12 at 08:20
-1

not sure if I get you, but try this:

select tag, count(*)  as amount
into #temp
from MYTABLE
group by tag


select t1.tag 
from #temp t1 join #temp t2 on t1.amount=t2.amount and t1.tag=t2.tag and t1.amount=2

should result bike and car since they both have 2 rows, whihc is equal to 2

Diego
  • 34,802
  • 21
  • 91
  • 134