-1

Consider the following query:

SELECT * 
FROM product_related pr 
    LEFT JOIN product p ON (pr.related_id = p.product_id) 
    LEFT JOIN product_to_store p2s ON (p.product_id = p2s.product_id) 
WHERE 
pr.product_id = '" . (int)$product_id . "' AND 
p.status = '1' AND 
p.date_available <= NOW() AND 
p2s.store_id = '0'

Tables structure:

product_related - 2 int columns, both indexed. 10 partitions by HASH(product_id). 1.2m records. Total table size 30mb.

product - 48 various columns (excluding large text/blobs). Column product_id is primary and indexed. Columns status and date_available are also indexed. 10 partitions by HASH(product_id). 130k records. Total table size 61mb.

product_to_store - 2 int columns, index PRIMARY(product_id, store_id). No partitions. 130k records. Total table size 3.4mb.

Most of the times the query performs fast, under 0.05s. But occasionally it slows to 30..50s. If you visit the affected page for the second time (basically just hit F5 right away) - the query again performs fine under fractions of second.

I've met this behavior earlier, with different tables, on other websites, but I can't figure out where is the flaw.


EXPLAIN

Explain output


Version

Version


Stats

Stats


Delay

Delay


product

product


product_related

product_related


product_to_store

product_to_store


So question: how to optimize occasionally slow query where all indexes are set and tables are partitioned?

Roman K.
  • 69
  • 6
  • 3
    To help you with your [tag:query-optimization] question, we need to see your table definitions, your index definitions, and the output of EXPLAIN. Please [read this](https://stackoverflow.com/tags/query-optimization/info), then [edit] your question. – O. Jones Oct 31 '22 at 10:05
  • 2
    And, your tables have too few rows to require partitioning them. With proper indexing a 1.2 megarow table will perform much better, and certainly more predictably, without partitions than it will with partitions. It's possible the query you showed us is sometimes contending with another query that references all the partitions. – O. Jones Oct 31 '22 at 13:31
  • 1
    @Roman K. Please post TEXT results of A) SHOW GLOBAL VARIABLES LIKE '%stats%'; and B) SELECT @@version; and C) SHOW GLOBAL VARIABLES LIKE '%delay%'; for a couple ideas of WHY you might have occasional delays. – Wilson Hauck Nov 02 '22 at 16:38
  • Thanks guys, I've updated my question will all info you suggested – Roman K. Nov 03 '22 at 10:26

1 Answers1

0

Because of some of the WHERE clauses, those LEFT JOINs are turned into INNER JOIN. Please don't use LEFT unless you need it.

These may help:

pr:   INDEX(product_id, related_id)  -- unless pr has PRIMARY KEY(product_id)
p:    INDEX(status, product_id, date_available)
p2s:  INDEX(store_id, product_id)

If any of those tables are "many-to-many", see Many-to-many for the optimal INDEXes.

Please provide SHOW CREATE TABLE so I can check the datatypes and indexes.

I have never found a use for PARTITION BY HASH.

"2 int columns, both indexed" -- One composite index, or two separate indexes. (It may make a big difference.)

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Believe it or not, but I've seen such query performing slower with large product_id number in case where products were added consequently. So to my understanding in such case new data is being added at the end of the database file, and reading from the end is slower than from beginning of the file. And partitioning in this case helped me. I've updated my question, you'll find all info about indexes. – Roman K. Nov 03 '22 at 10:36
  • Reading from the end is slower if you don't have a suitable index. You seem to have so little data that partitioning would actually slow things down. – Rick James Nov 03 '22 at 19:11