0

Have found an inefficient query in our system. content holds versions of slides, and this is supposed to select the highest version of a slide by id.

SELECT `content`.*
FROM (`content`)
JOIN (
SELECT max(version) as `version` from `content`
WHERE `slide_id` = '16901'
group by `slide_id`
) c ON `c`.`version` = `content`.`version`;

EXPLAIN

+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
| id | select_type | table            | partitions | type   | possible_keys                                                                  | key                                | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+
|  1 | PRIMARY     | <derived2>       | NULL       | system | NULL                                                                           | NULL                               | NULL    | NULL  |    1 |   100.00 | NULL                     |
|  1 | PRIMARY     | content          | NULL       | ref    | PRIMARY,version                                                                | PRIMARY                            | 8       | const | 9703 |   100.00 | NULL                     |
|  2 | DERIVED     | content          | NULL       | ref    | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id                      | fk_content_slides_idx              | 8       | const |    1 |   100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------+------+----------+--------------------------+

One big issue is that it returns almost all the slides in the system as the outer query does not filter by slide id. After adding that I get...

SELECT `content`.* 
FROM (`content`) 
JOIN ( 
SELECT max(version) as `version` from `content`  
WHERE `slide_id` = '16901' group by `slide_id` 
) c ON `c`.`version` = `content`.`version` 
WHERE `slide_id` = '16901';

EXPLAIN

+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
| id | select_type | table            | partitions | type   | possible_keys                                                                  | key                                | key_len | ref         | rows | filtered | Extra                    |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+
|  1 | PRIMARY     | <derived2>       | NULL       | system | NULL                                                                           | NULL                               | NULL    | NULL        |    1 |   100.00 | NULL                     |
|  1 | PRIMARY     | content          | NULL       | const  | PRIMARY,fk_content_slides_idx,version,slide_id                                 | PRIMARY                            | 16      | const,const |    1 |   100.00 | NULL                     |
|  2 | DERIVED     | content          | NULL       | ref    | PRIMARY,fk_content_slides_idx,thumbnail_asset_id,version,slide_id              | fk_content_slides_idx              | 8       | const       |    1 |   100.00 | Using where; Using index |
+----+-------------+------------------+------------+--------+--------------------------------------------------------------------------------+------------------------------------+---------+-------------+------+----------+--------------------------+

That reduces the amount of rows down to one correctly, but doesnt really speed things up.

There are indexes on version, slide_id and a unique key on version AND slide_id.

Is there anything else I can do to speed this up?

Use a TOP LIMIT 1 insetead of Max ?

m

Matt Bryson
  • 2,286
  • 2
  • 22
  • 42

2 Answers2

0

Just a suggestion i think you should avoid the group by slide_id because you are filter by one slide_id only (16901)

SELECT `content`.* 
FROM (`content`) 
JOIN ( 
SELECT max(version) as `version` from `content`  
   WHERE `slide_id` = '16901' 
) c ON `c`.`version` = `content`.`version` 
WHERE `slide_id` = '16901';
ScaisEdge
  • 131,976
  • 10
  • 91
  • 107
0

MySQL seems to take an index (version, slide_id) to join the tables. You should get a better result with

SELECT `content`.* 
FROM `content`
FORCE INDEX FOR JOIN (fk_content_slides_idx) 
join (    
    SELECT `slide_id`, max(version) as `version` from `content`  
    WHERE `slide_id` = '16901' group by `slide_id` 
) c ON `c`.`slide_id` = `content`.`slide_id` and `c`.`version` = `content`.`version` 

You need an index that has slide_id as first column, I just guessed that's fk_content_slides_idx, if not, take another one.

The part FORCE INDEX FOR JOIN (fk_content_slides_idx) is just to enforce it, you should try if mysql takes it by itself without forcing (it should).

You might get even a slightly better result with an index (slide_id, version), it depends on the amount of data (e.g. the number of versions per id) if you see a difference (but you should not spam indexes, and you already have a lot on this table, but you can try it for fun.)

Solarflare
  • 10,721
  • 2
  • 18
  • 35