MySql refuses to use index

Question

I'm new to query optimizations so I accept I don't understand everything yet but I do not understand why even this simple query isn't optimized as expected.

My table:

+------------------+-----------+------+-----+-------------------+----------------+
| Field            | Type      | Null | Key | Default           | Extra          |
+------------------+-----------+------+-----+-------------------+----------------+
| tasktransitionid | int(11)   | NO   | PRI | NULL              | auto_increment |
| taskid           | int(11)   | NO   | MUL | NULL              |                |
| transitiondate   | timestamp | NO   | MUL | CURRENT_TIMESTAMP |                |
+------------------+-----------+------+-----+-------------------+----------------+

My indexes:

+-----------------+------------+-------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name          | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| tasktransitions |          0 | PRIMARY           |            1 | tasktransitionid | A         |         952 |     NULL | NULL   |      | BTREE      |         |               |
| tasktransitions |          1 | transitiondate_ix |            1 | transitiondate   | A         |         952 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+-------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

My query:

SELECT taskid FROM tasktransitions WHERE transitiondate>'2013-09-31 00:00:00';

gives this:

+----+-------------+-----------------+------+-------------------+------+---------+------+------+-------------+
| id | select_type | table           | type | possible_keys     | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+------+-------------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | tasktransitions | ALL  | transitiondate_ix | NULL | NULL    | NULL | 1082 | Using where |
+----+-------------+-----------------+------+-------------------+------+---------+------+------+-------------+

If I understand everything correctly Using where and ALL means that all rows are retrieved from the storage engine and filtered at server layer. This is sub-optimal. Why does it refuse to use the index and only retrieve the requested range from the storage engine (innoDB)?

Cheers

@FreshPrinceOfSO No, but why would I have to? I expect the preprocessor to be smart enough given the high cardinality to select the index automatically. — Tin, Nov 04 '13 at 17:52
why is the column `transitiondate` appearing with key = `MUL` in table definition? — Sebas, Nov 04 '13 at 18:02
@Sebas The date is not unique. Multiple task transitions might occur simultaneously. — Tin, Nov 04 '13 at 18:09

score 10 · Accepted Answer · answered Nov 04 '13 at 17:55

10

MySQL will not use the index if it estimates that it would select a significantly large portion of the table, and it thinks that a table-scan is actually more efficient in those cases.

By analogy, this is the reason the index of a book doesn't contain very common words like "the" -- because it would be a waste of time to look up the word in the index and find the list of page numbers is a very long list, even every page in the book. It would be more efficient to simply read the book cover to cover.

My experience is that this happens in MySQL if a query's search criteria would match greater than 20% of the table, and this is usually the right crossover point. There could be some variation based on the data types, size of table, etc.

You can give a hint to MySQL to convince it that a table-scan would be prohibitively expensive, so it would be much more likely to use the index. This is not usually necessary, but you can do it like this:

SELECT taskid FROM tasktransitions FORCE INDEX (transitiondate_ix)
WHERE transitiondate>'2013-09-31 00:00:00';

answered Nov 04 '13 at 17:55

Bill Karwin

538,548
86
673
828

Thanks @Bill, this seems indeed to be the reason. The query returns 186 rows out of 987 rows. When I change the date such that only 10 rows match, it uses the index. What I still don't understand is that `using where` is in explain's output when the key is being used. – Tin Nov 04 '13 at 18:07
2

`Using where` simply means that the SQL execution is applying a condition to filter out some rows. In other words, not all rows examined are returned. – Bill Karwin Nov 04 '13 at 18:11
I'm reading the book High Performance MySql from Baron Schwartz. According to this book `Using where` means that too many rows are read from the storage engine and a better suited index will probably solve this. But I don't understand how to. – Tin Nov 04 '13 at 18:30
To be correct: sometimes a better suited index is available when `using where` is reported. – Tin Nov 04 '13 at 18:33
In your example you're searching only a single column, and it's already indexed. So there's no other opportunity to add an index that would be relevant. You're just searching for a range of dates that encompasses a significant part of the table. – Bill Karwin Nov 04 '13 at 18:34
Also read the description of `Using where` in http://dev.mysql.com/doc/refman/5.6/en/explain-output.html – Bill Karwin Nov 04 '13 at 18:34

score 1 · Answer 2 · answered Feb 21 '22 at 19:40

I once was trying to join two tables and MySQL was refusing to use an index, resulting in >500ms queries, sometimes a few seconds. Turns out the column I was joining on had different encodings on each table. Changing both to the same encoding sped up the query to consistently less than 100ms.

score 0 · Answer 3 · edited Oct 15 '20 at 13:32

0

Just in case, it helps somebody.

I have a table with a varchar column _id (long int coded as string). I added an index for this column, but query was still slow. I was executing this query:

select * from table where (_id = 2221835089) limit 1

I realized that the _id column wasn't been generated as string (I'm Laravel as DB framework). Well, if query is executed with the right data type in the where clause everything worked like a charm:

select * from table where (_id = '2221835089') limit 1

edited Oct 15 '20 at 13:32

Dharman

30,962
25
85
135

answered Oct 15 '20 at 07:12

vegetable

313
3
13

Thank you. This did indeed help with a seemingly trivial query refusing to use an index - even if FORCEd. – DaveW Oct 14 '22 at 17:22

score 0 · Answer 4 · edited Apr 01 '21 at 21:25

I am new at my MySQL 8.0, have finished 2 simple tutorials completely, and there is only two subjects that has not worked for me, one of them is indexing. I read the section labeled "2 Answers" and found that using the statement suggested at the end of said section, seems to defeat the purpose of the original USE INDEX or FORCE INDEX statement below. The suggested statement is like getting a table sorted via a WHERE statement instead of MySQL using USE INDEX or FORCE INDEX. It works, but seems to me it is not the same as using the natural USE INDEX or FORCE INDEX. Does any one knows why MySQL is ignoring my simple request to index a 10 row table on the Lname column?

Field	Type	Null	Key	Default	Extra
ID	int	NO	PRI	Null	auto_increment
Lname	varchar(20)	NO	MUL	Null
Fname	varchar(20)	NO	Mul	Null
City	varchar(15)	NO		Null
Birth_Date	date	NO		Null

CREATE INDEX idx_Lname ON TestTable (Lname);
SELECT * FROM TestTable USE INDEX (idx_Lname);
SELECT * From Testtable FORCE INDEX (idx_LastFirst);

The last two statements fail because both get ignored and I get a table back unsorted. — David Hernandez, Apr 01 '21 at 13:59

MySql refuses to use index

4 Answers4

Linked