MySQL index on timestamp column not used for large date ranges

Question

I have table as

+-------------------+----------------+------+-----+---------------------+-----------------------------+
| Field             | Type           | Null | Key | Default             | Extra                       |
+-------------------+----------------+------+-----+---------------------+-----------------------------+
| id                | bigint(20)     | NO   | PRI | NULL                | auto_increment              |
| runtime_id        | bigint(20)     | NO   | MUL | NULL                |                             |
| place_id          | bigint(20)     | NO   | MUL | NULL                |                             |
| amended_timestamp | varchar(50)    | YES  |     | NULL                |                             |
| applicable_at     | timestamp      | NO   |     | CURRENT_TIMESTAMP   | on update CURRENT_TIMESTAMP |
| schedule_time     | timestamp      | NO   | MUL | 0000-00-00 00:00:00 |                             |
| quality_indicator | varchar(10)    | NO   |     | NULL                |                             |
| flow_rate         | decimal(15,10) | NO   |     | NULL                |                             |
+-------------------+----------------+------+-----+---------------------+-----------------------------+

I have index on schedule_time as

create index table_index on table(schedule_time asc);

The table currently has 2121552+ records.

The thing I fail to understand is when I do explain

explain select runtime_id from table where schedule_time >= now() - INTERVAL 1 DAY;
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
| id | select_type | table    | type  | possible_keys                | key                          | key_len | ref  | rows  | Extra       |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
|  1 | SIMPLE      | table    | range | table_index                  | table_index                  | 4       | NULL | 38088 | Using where |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
1 row in set (0.00 sec)

Above index is used, but the below one not.

mysql> explain select runtime_id from table where schedule_time >= now() - INTERVAL 30 DAY;
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
| id | select_type | table    | type | possible_keys                | key  | key_len | ref  | rows    | Extra       |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | table    | ALL  | table_index                  | NULL | NULL    | NULL | 2118107 | Using where |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

I'll really appreciate if someone can point out whats wrong here, as the data is updated every 12 minutes and as the time passes by query for 30 days or may be 60 days will get very slow.

The final query where I plan to use it is as follows

select avg(flow_rate),c.group from table a ,(select runtime_id from table where schedule_time >= now() - INTERVAL 1 DAY group by schedule_time ) b,place c  where a.runtime_id = b.runtime_id and a.place_id = c.id group by c.group;

Update =====>

As per the comments between fails too.

mysql> explain select runtime_id from table where schedule_time between '2013-07-17 12:48:00' and '2013-08-17 12:48:00';
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
| id | select_type | table    | type | possible_keys                | key  | key_len | ref  | rows    | Extra       |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
|  1 | SIMPLE      | table    | ALL  | table_index                  | NULL | NULL    | NULL | 2118431 | Using where |
+----+-------------+----------+------+------------------------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)

mysql> explain select runtime_id from table where schedule_time between '2013-08-16 12:48:00' and '2013-08-17 12:48:00';
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
| id | select_type | table    | type  | possible_keys                | key                          | key_len | ref  | rows  | Extra       |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
|  1 | SIMPLE      | table    | range | table_index                  | table_index                  | 4       | NULL | 38770 | Using where |
+----+-------------+----------+-------+------------------------------+------------------------------+---------+------+-------+-------------+
1 row in set (0.00 sec)

Update 2 =======>

mysql> select count(*) from table where schedule_time between '2013-08-16 12:48:00' and '2013-08-17 12:48:00';
+----------+
| count(*) |
+----------+
|    19440 |
+----------+
1 row in set (0.01 sec)

mysql> select count(*) from table where schedule_time between '2013-07-17 12:48:00' and '2013-08-17 12:48:00';
+----------+
| count(*) |
+----------+
|   597132 |
+----------+
1 row in set (0.00 sec)

Server version: 5.5.24-0ubuntu0.12.04.1 (Ubuntu)

Similar question here - the idea is to use the 'between' statement : http://stackoverflow.com/questions/2041575/mysql-query-records-between-today-and-last-30-days — FreudianSlip, Aug 18 '13 at 09:15
I wasn't able to reproduce that behavior (not trying that hard, thought...) Just a wild guess: does running `ANALYZE TABLE my_table` improve things? — Sylvain Leroux, Aug 18 '13 at 09:26
@SylvainLeroux I ran ANALYZE TABLE but same result with both between and Interval — baba.kabira, Aug 18 '13 at 09:29
The really puzzling thing if `EXPLAIN` reporting `ALL` as *join type* (in the column `type`)... As a *possible* workaround, could you try: `SET @d = now() - INTERVAL 30 DAY; SELECT .... where schedule_time >= @d` — Sylvain Leroux, Aug 18 '13 at 09:36
These explains say that the table has 38770 rows for 1 day period, and 2118431 for 1 month period. Is that true ? Please run: `select count(*) from table where schedule_time between '2013-07-17 12:48:00' and '2013-08-17 12:48:00';` and `select count(*) from table where schedule_time between '2013-07-17 12:48:00' and '2013-08-17 12:48:00';` and paste results here. — krokodilko, Aug 18 '13 at 11:25
Thank you for posting results. One question - you wrote in the question that the table has 21K rows, but the query returned almost 600 K rows, so it's not a truth. How many rows the whole table has ? — krokodilko, Aug 18 '13 at 13:26
@kordirko sorry about confusion that was simple info, my bad , correct result at present is 2121552, the table is live keeps updating every 12 mins. — baba.kabira, Aug 18 '13 at 13:33
I found this link http://bugs.mysql.com/bug.php?id=64998 I guess there is a bug in mysql version 5.5.24 — baba.kabira, Aug 18 '13 at 13:38
600 k / 2100 k = 28% of the whole table, I think that the optimiser probably makes a good decision preffering the full table scan over the index, a sequential scanning of the whole table is probably faster than a random access using the index in this case, you may try a `force index hint` - http://dev.mysql.com/doc/refman/5.6/en/index-hints.html, this should force MySql to use that index, and then compare query times with FTS and INDEX. — krokodilko, Aug 18 '13 at 13:49
@kordirko with force hint explain shows usage of index with rows scanned 1061213, the result with force index(avg. .81s) is faster than without it(avg. 1.81s) for 30 days but if I go for 365 days which is all the data as I don't have more than a year of data, the query without index is 2x faster. — baba.kabira, Aug 18 '13 at 14:13

score 5 · Answer 1 · edited Jul 05 '21 at 11:25

The MySQL optimizer tries to do the fastest thing. Where it thinks that using the index will take as long or longer than doing a table scan, it abandons the available index.

This is what you see it doing in your examples:

where the range is small (1 day) the index will be faster;
where the range is large, you're going to be hitting so much more of the table you might as well scan the table directly (remember, using the index involves searching the index and then grabbing the indexed records from the table - two sets of seeks).

If you think you know better than the optimizer (it isn't perfect), use INDEX hints:

The USE INDEX (index_list) hint tells MySQL to use only one of the named indexes to find rows in the table. The alternative syntax IGNORE INDEX (index_list) tells MySQL to not use some particular index or indexes. These hints are useful if EXPLAIN shows that MySQL is using the wrong index from the list of possible indexes.

MySQL index on timestamp column not used for large date ranges

1 Answers1