22

I have a query (with the purpose of making a view) which is using a few joins to get each column. Performance degrades quickly (exponentially?) for each set of joins added.

What would be a good approach to make this query faster? Please see comments within the query.

If it helps, this is using the WordPress DB schema.

Here is a screenshot of EXPLAIN enter image description here

PRODUCTS TABLE

+--+----+
|id|name|
+--+----+
|1 |test|
+--+----+

METADATA TABLE

+----------+--------+-----+
|product_id|meta_key|value|
+----------+--------+-----+
|1         |price   |9.99 |
+----------+--------+-----+
|1         |sku     |ABC  |
+----------+--------+-----+

TERM_RELATIONSHIPS TABLE

+---------+----------------+
|object_id|term_taxonomy_id|
+---------+----------------+
|1        |1               |
+---------+----------------+
|1        |2               |
+---------+----------------+

TERM_TAXONOMY TABLE

+----------------+-------+--------+
|term_taxonomy_id|term_id|taxonomy|
+----------------+-------+--------+
|1               |1      |size    |
+----------------+-------+--------+
|2               |2      |stock   |
+----------------+-------+--------+

TERMS TABLE

+-------+-----+
|term_id|name |
+-------+-----+
|1      |500mg|
+-------+-----+
|2      |10   |
+-------+-----+

QUERY

SELECT 
  products.id,
  products.name,
  price.value AS price,
  sku.value AS sku,
  size.name AS size
FROM products

/* These joins are performing quickly */

INNER JOIN `metadata` AS price ON products.id = price.product_id AND price.meta_key = 'price'
INNER JOIN `metadata` AS sku ON products.id = sku.product_id AND sku.meta_key = 'sku'

/* Here's the part that is really slowing it down - I run this chunk about 5 times with different strings to match */

INNER JOIN `term_relationships` AS tr ON products.id = tr.object_id
  INNER JOIN `term_taxonomy` AS tt
  ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
    INNER JOIN `terms` AS size
    ON tt.term_id = size.term_id
DoNotArrestMe
  • 1,285
  • 1
  • 9
  • 20
dloewen
  • 975
  • 1
  • 10
  • 26
  • 2
    `OR` can kill index usage, try to rewrite using a union. – KM. Jan 03 '14 at 18:33
  • 1
    show the output of `DESC tableName` for each table. – Manu Jan 03 '14 at 18:51
  • or atleast output of `EXPLAIN ;` – Manu Jan 03 '14 at 18:52
  • @Manu I added an image of the EXPLAIN results to the question. Thanks. – dloewen Jan 03 '14 at 19:37
  • Also, I tried re-writing using subqueries for each column, but then PHPMyAdmin gives me "Commands out of sync; you can't run this command now"... – dloewen Jan 03 '14 at 20:09
  • @dloewen, I must have been seeing things, I remember your code having an "OR" it in it, oh well just ignore my previous comment. – KM. Jan 03 '14 at 21:00
  • it _seems_ like the `products.meta_key` nor the `term_taxonomy.taxonomy` fields are not indexed. Could you please confirm that they are? – Luis Gonzalez Jan 09 '14 at 21:43
  • KM & dloewen - so please clean up your obsolete comments.. – Tomas Jan 09 '14 at 22:14
  • dloewen, are you sure the explain is of the same query as you present? Explain says "using where", but your query doesn't contain where clause! **It is important that you provide us the exactly the same query!** – Tomas Jan 09 '14 at 22:23
  • 2
    **The explain for sure is of a different query! Tables tt2, tt3 are not present. -1 until you fix.** – Tomas Jan 09 '14 at 22:32
  • Have a index on `taxonomy` column and put the `tt.taxonomy = 'size'` condition before inner join so that you can get benefit from the short circuit operation. `tt.taxonomy = 'size' AND tr.term_taxonomy_id = tt.term_taxonomy_id`. PLease provide proper exaplain statement of your given query. – Meherzad Jan 10 '14 at 09:38

7 Answers7

16

Your performance issue is most likely caused by the join with the 'term_taxonomy' table.
All other joins seems to use the primary key (where you probobly have working indexes on).

So my suggestion is to add a compound index on term_taxonomy_id and term_id (or if you must: taxonomy). Like this:

CREATE UNIQUE INDEX idx_term_taxonomy_id_taxonomy
ON term_taxonomy( term_taxonomy_id, taxonomy);

Hope this will help you.

carleson
  • 728
  • 1
  • 5
  • 14
  • 6
    This could work. +1. Also it would be a good idea to make a **compound** index for `metadata (product_id, mate_key)` because having a wide index diversity means there are more options for mysql optimizer to create a better execution plan. – Karolis Jan 11 '14 at 10:15
  • 1
    Bingo! Creating the compound index in the metadata table on product_id and meta_key made the query execute in under 1 second, a huge improvement from the 30+ seconds! – dloewen Jan 16 '14 at 01:03
  • @Karolis can you post your solution separately so I can give you the bounty? Thanks also carleson for the tip that lead to the solution. – dloewen Jan 16 '14 at 01:05
  • @dloewen Well, I think carleson gave you the right direction and my comment was just 2 cents in order to make his answer more comprehensive :-) – Karolis Jan 16 '14 at 23:53
2

Make Sure all the columns on which there is "ON" conditional statements is there, should be indexed. This will significantly improve the speed.

akkig
  • 116
  • 7
0

Try this:

SELECT p.id, p.name, MAX(CASE m.meta_key WHEN 'price' THEN m.value ELSE '' END) AS price, 
       MAX(CASE m.meta_key WHEN 'sku' THEN m.value ELSE '' END) AS sku, s.name AS size
FROM products p 
INNER JOIN `metadata` AS m ON p.id = m.product_id  
INNER JOIN `term_relationships` AS tr ON p.id = tr.object_id 
INNER JOIN `term_taxonomy` AS tt ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'
INNER JOIN `terms` AS s ON tt.term_id = s.term_id
GROUP BY p.id;

If you still find that your query is slow then add the EXPLAIN plan of my query so I can find which columns needs INDEX.

Saharsh Shah
  • 28,687
  • 8
  • 48
  • 83
0
    Declare @query as NVARCHAR(MAX)
    set @query = ('SELECT 
    products.id,
    products.name,
    price.value AS price,
    sku.value AS sku,
    size.name AS size
    FROM products
    INNER JOIN metadata AS price ON products.id = price.product_id AND price.meta_key = price
    INNER JOIN metadata AS sku ON products.id = sku.product_id AND sku.meta_key = sku
    INNER JOIN term_relationships AS tr ON products.id = tr.object_id
    INNER JOIN term_taxonomy AS tt
    ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = size
    INNER JOIN terms AS size
    ON tt.term_id = size.term_id
    into #t')

    exec(@query);
    select * from #t

I Hope the above way will reduce the time utilization, or creating a temporary table with all the fields you select and updating the temporary table by joining to the temporary table to all the other tables might also be effective, well i am not sure about it but Even I am waiting for your result as your question seems intresting

JB9
  • 49
  • 8
  • 4
    I can´t see the point of doing this ? The performance will gets even worse in this case, since you have the same query and then also uses a temp table. Temp tables Could speed up slow queries if you split up big queries into smaller sub-sets, but that is not the case here. – carleson Jan 11 '14 at 08:58
0

The below script is formatted as per SQL Server rules - You can change this as per MySQL rules and give it a try -

SELECT 
  P.id,
  P.name,
  PIVOT_METADATA.price,
  PIVOT_METADATA.sku,
  size.name AS size
FROM products P (NOLOCK)

INNER JOIN term_relationships AS tr (NOLOCK)
    ON P.id = tr.object_id

INNER JOIN term_taxonomy AS tt (NOLOCK)
    ON tr.term_taxonomy_id = tt.term_taxonomy_id AND tt.taxonomy = 'size'

INNER JOIN terms AS size (NOLOCK)
    ON tt.term_id = size.term_id

INNER JOIN METADATA (NOLOCK)
    PIVOT
    (
        MAX(value)
        FOR [meta_key] IN (price,sku)
    )AS PIVOT_METADATA
    ON P.id = PIVOT_METADATA.product_id

What I feel could be the bottleneck in your query - You are joining Metadata 2 times. Since there are 1-to-many relationships in your tables, the Metadata 2-join doesn't hurt but after that as you join more tables - the number of rows due to 1-to-many relationship increase - and hence the prformance drops.

What I've tried to achieve - I'm making sure that as many 1-to-1 relationships are fulfilled as possible. To do this, I've done a Pivot on Metadata adn made price & sku as columns. Now my product id shall have only one row in Metadata pivot. alos, I've made sure that I join this picot at the very end.

Give it a try. Please share the expected performance, number of records you have & also what performance you get with my asnwer.

Suyash Khandwe
  • 386
  • 3
  • 11
0

METADATA_TABLE and TERM_RELATIONSHIP_TABLE do not have any proimary key. When there are huge records in these tables your query performancy will be hit.

Checkpoints to increase your performance.

  1. All tables should have primary key. This is because rows in table will be physically sorted.
  2. For small and queries involving few tables keeping primary key in table would be enough. If you still wish to improve performance, create non-clustered-index for columns such as *object_Id field of term_relationships table* . Non-clustered index should be created for those columns in table which are taking part in join operation.

However, point to be noted is that, non-clustered index should be very less on those tables where multiple insert and updates are happening. This is not a simple question and can’t be answered only based on run time. There are other factors that affect the answer especially if environment where a stored procedure is running is heavily transactional.

You can find more here

gokul
  • 383
  • 1
  • 8
0

I would suggest those:

  • Consider reducing those joins from business level;
  • If not possible to do from "top"(business level), and the data is not for real time, I would suggest to prepare a memory table(I know the solution is not ideal). And select your data from memory table directly.

In my experience:

  • "joins" is the killer for performance, the bigger your data is, the more pain you will feel;
  • Try to get rid of joins, not try to improve query performance by keeping joins unless you have to. Usually I will try to fix those issues from "top" to "bottom"
  • The last suggestion is if all above don't work. I will consider "map/reduce + fulltext search", if that worth to do.

(Forgive me I didn't provide solution to improve your query performance.)

Joshua
  • 689
  • 7
  • 16