1

Using the following table structure:

Items (~20,000 records)

  • item_id

Properties (~30 records)

  • property_id

Item_properties (~40,000 records)

  • id
  • property_id
  • item_id

A user can choose to filter items by a number of fields in the items table itself, and can also choose any number of properties that the item must have. The search needs to choose an item with all properties, not just one of them. I'm currently using the format

SELECT item.field...
FROM items
INNER JOIN item_properties AS ip1 ON ip1.item_id=item.item_id and ip1.property_id=3
INNER JOIN item_properties AS ip2 ON ip2.item_id=item.item_id and ip2.property_id=4
INNER JOIN item_properties AS ip3 ON ip3.item_id=item.item_id and ip3.property_id=5
INNER JOIN item_properties AS ip4 ON ip4.item_id=item.item_id and ip4.property_id=6
etc...
WHERE item.something_else='words'
GROUP BY item_id

I have also tried, as a way of specifying search purely by WHERE rather than by JOIN

SELECT item.field...
FROM items
WHERE item.something_else='words'
and item_id IN (select item_id from item_properties where property_id=3)
and item_id IN (select item_id from item_properties where property_id=4)
and item_id IN (select item_id from item_properties where property_id=5)
and item_id IN (select item_id from item_properties where property_id=6)
etc...

However this approach seemed, if anything, to take even longer to query the set. With about 4 properties chosen query time is about 4-5s, much more and the queries tend to get killed or bring the MySQL server down altogether.

As far as I am aware all the _id fields are indexed on each table, being the primary keys of their respective tables too.

Are there ways to improve the query or might I need to limit the number of options that can be queried?

M1ke
  • 6,166
  • 4
  • 32
  • 50
  • 1
    INNER JOIN item_properties AS ip1 ON ip1.item_id=item.item_id and ip1.property_id IN(3,4,5,6) – Mihai Nov 18 '14 at 12:30
  • Thanks; that works to create an __OR__ style search, where an item must have at least one of the properties. I'd actually like the item to have _ALL_ the properties chosen. – M1ke Nov 18 '14 at 12:57

2 Answers2

1

Use post aggregation filtering if you want all property_id

SELECT item.field
FROM items
INNER JOIN item_properties AS ip1 ON ip1.item_id=item.item_id and  
and ip1.property_id IN(3,4,5,6)
WHERE item.something_else='words'
GROUP BY item.field
HAVING COUNT(DISTINCT property_id )=4

4 is the number of property_id IN(3,4,5,6)

Mihai
  • 26,325
  • 7
  • 66
  • 81
  • This seems to work well; query time isn't substantially altered using the `HAVING` and it's faster than adding more indexes to the multiple join or sub query approach. – M1ke Nov 18 '14 at 13:06
0

I think you just need an index on item_properties:

create index idx_item_properties_2 on item_properties(item_id, property_id)
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Is that different functionally from the item_id and property_id already being indexed on that table? – M1ke Nov 18 '14 at 12:51
  • I've created that and it does speed up the multiple join approach used above. Still 4.5s for the largest possible query but significantly better than crashing out MySQL. I think for efficiency of the actual query I may go with the answer that uses `HAVING` but this is helpful. – M1ke Nov 18 '14 at 13:07
  • @M1ke . . . Why do you have the `group by item_id`? If you have no duplicate properties, then this should be unnecessary (and you might find that removing it gives you another performance boost). – Gordon Linoff Nov 18 '14 at 13:11
  • @M1ke . . . A composite index is different from two separate indexes on the columns. – Gordon Linoff Nov 18 '14 at 13:12
  • Wouldn't the group be necessary to prevent one row for each item/property combination? I.e. the same item information would be present for each property that the item had, if using the joins? – M1ke Nov 18 '14 at 13:30
  • @M1ke . . . If there are no duplicate matches in the `join`, then you will get only one row per item. And, you will only get a row for an item when all joins succeed (because you are using `inner join`). – Gordon Linoff Nov 18 '14 at 14:49