0

I am having troubles with a particular query being slow. Although everything is heavily indexed, some similar queries working fine and the indexes are used, the query still is slow as hell. I cannot understand why, so maybe anybody can help.

Just for the prerequisites: the write speed of the underlying table does not matter. The table contains ~3.5 million entries but I guess MySQL should handle that just fine.

The query that is being slow takes about 2s

 SELECT DISTINCT t.`tag_3`  FROM `image_tags` t
     WHERE  t.`type` = 1 AND t.`category` LIKE "00%" AND tag_1 = "0"

--- DESCRIBE OUTPUT
--- The used index thirdtag is just an index defined as (type, category, tag_1, tag_3)
--- The actual result is 201 rows
+----+-------------+-------+- -----------------------+----------+---------+------+---------+-------------------------------------------+
| id | select_type | table | type  | possible_keys   | key      | key_len | ref  | rows    | Extra                                     |
+----+-------------+-------+-------+-----------------+----------+---------+------+---------+-------------------------------------------+
|  1 | SIMPLE      | t     | range | [... A LOT ...] | thirdtag | 31      | NULL | 1652861 | Using where; Using index; Using temporary |
+----+-------------+-------+-------+-----------------+----------+---------+------+---------+-------------------------------------------+

The only thing that's standing out is the enormous amount of rows involved. If you compare with the 2 fast queries I attached to the end of this question it is literally the only thing different (at least from the first one). So most probably that's the problem. But that's how the data is given to me so I need to work with that. I thought if involved in the index mysql could handle the data just fine.

Does anybody have a suggestion how to optimize the query? Any suggestions if i could use different indexes that suit more to the query?

For comparison these 2 similar queries work blazing fast

 --- just a longer category string resulting in fewer results
 SELECT DISTINCT t.`tag_3`  FROM `image_tags` t
     WHERE  t.`type` = 1 AND t.`category` LIKE "0000%" AND tag_1 = "0"

 --- and additional where clause
 SELECT DISTINCT t.`tag_3`  FROM `image_tags` t
     WHERE  t.`type` = 1 AND t.`category` LIKE "00%" AND tag_1 = "0" and tag_2 = ""

The table (it has a lot of indexes probably too long to paste).

+----------+------------------+------+-----+---------+----------------+
| Field    | Type             | Null | Key | Default | Extra          |
+----------+------------------+------+-----+---------+----------------+
| id       | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| image    | char(8)          | NO   | MUL | NULL    |                |
| category | varchar(6)       | YES  | MUL | NULL    |                |
| type     | tinyint(1)       | NO   | MUL | NULL    |                |
| tag_1    | char(3)          | NO   | MUL | NULL    |                |
| tag_2    | char(3)          | NO   | MUL | NULL    |                |
| tag_3    | char(3)          | NO   | MUL | NULL    |                |
| tag_4    | char(3)          | NO   | MUL | NULL    |                |
| tag_5    | char(3)          | NO   | MUL | NULL    |                |
| tag_6    | char(3)          | NO   | MUL | NULL    |                |
+----------+------------------+------+-----+---------+----------------+
halfer
  • 19,824
  • 17
  • 99
  • 186
patman
  • 2,780
  • 4
  • 30
  • 54
  • 1
    Do you have separate index on tag_3? I think that query would be faster if you create composite index on (tag_1, category, type, tag_3) in this order. Then DISTINCT can get result from index without taking it from a table. – user1209304 Jan 08 '16 at 13:13
  • Yes i have a second index on the tag_3 column. But i will try to reorder. Did play with the index order a little bit but did only get worse results. Don't know if i tried your order though – patman Jan 08 '16 at 13:53

1 Answers1

2

Please provide SHOW CREATE TABLE, it is more descriptive than DESCRIBE! In particular, I cannot see what indexes you have.

As My index cookbook explains, start the index with any fields that are '=', then you get one chance to add a 'range' comparison. Your category is a range, so

WHERE  t.`type` = 1 AND t.`category` LIKE "00%" AND tag_1 = "0"

does not get past category in

INDEX(type, category, tag_1, tag_3)

For your 3 queries, these are the best indexes:

INDEX(type, tag_1, category)
INDEX(type, tag_1, category)
INDEX(type, tag_1, tag_2, category)

category should be last; the other columns can be in any order. Perhaps some one of your indexes handled the 3rd case?

it has a lot of indexes probably too long to paste

Probably most of them are unused. Keep in mind that INDEX(a) is unnecessary if you also have INDEX(a,b).

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • +1 Good points. Including `tag_3` as a fourth column i.e. `(type,tag_1,category,tag_3)` would create a *covering index* for the query. With that index, we'd expect EXPLAIN output to show "Using index" in the Extra column. That means the query can be satisfied entirely from the index pages, without requiring any lookups to the data pages in the underlying table. (A covering index is likely to be a better index for this query.) Also, it might make sense to change the order of the columns `type` and `tag_1` in the index. That might make a small difference, depending on the cardinality. – spencer7593 Jan 08 '16 at 22:10
  • There's also a possibility that relocating `tag_3` before `category` in the index would enable MySQL to avoid a "sort unique" operation, which is otherwise required to get the "distinct" list. e.g. `(tag_1,type,tag_3,category)`. If the execution plan uses that index, and the EXPLAIN output still shows "Using filesort" in the `Extra` column, then that index would be pointless (for this query.) But if the execution plan avoids a "Using filesort" operation, then this might turn out to be a better index for the query. – spencer7593 Jan 08 '16 at 22:23