Why order by clause can make use of index?

Question

Assume in tableX we have id(primary key) name and age, phone, all with indices.

In this query: select phone from tableX where name='Dennis' order by age

I guess the process is

Using name index to get ids that match Dennis. Denote the id set by S
Use age index to do order by on the ids obtained in 1, get a sorted id list, denoted by L
Use the sorted id list L to get phone

I assume in step 2 it may use a sequential scan along the B+ tree leaf nodes, checking whether the id in that leaf node is in the id set S obtained in step 1. If so, add it into list L, then we can get a id list L sorted by age.

But how is that better than simple sequential scan? Aren't they both sequential scan?

Edit:

explain says it uses index name and does a filesort

+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+
| id | select_type | table  | partitions | type | possible_keys | key      | key_len | ref   | rows | filtered | Extra          |
+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+
|  1 | SIMPLE      | tableX | NULL       | ref  | idx_name      | idx_name | 123     | const |    1 |   100.00 | Using filesort |
+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+

Actually I was not sure in what situation index can be useful in order by clause, so I came up with a bad example to illustrate my doubt.

But the example provided by Tim Biegeleisen is great.

(More table details if you are interested:)

mysql> create table tableX(
    -> id int primary key,
    -> name varchar(30),
    -> age int,
    -> phone varchar(30)
    -> );
Query OK, 0 rows affected (0.07 sec)

mysql> create index idx_name on tableX(name);
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create index idx_age on tableX(age);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create index idx_phone on tableX(phone);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show index from tableX;
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table  | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| tableX |          0 | PRIMARY   |            1 | id          | A         |           1 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_name  |            1 | name        | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_age   |            1 | age         | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_phone |            1 | phone       | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.01 sec)

mysql> select * from tableX;
+----+--------+------+-------+
| id | name   | age  | phone |
+----+--------+------+-------+
|  1 | Jack   |   20 | 180   |
|  2 | Dennis |   22 | 180   |
|  3 | Dennis |   18 | 1790  |
+----+--------+------+-------+

Don't assume, use explain to see the query plan. I would assume it will use the age index or the name index, but not both. — ysth, Dec 01 '22 at 05:34

score 3 · Accepted Answer · edited Dec 01 '22 at 17:47

3

Actually, the index which should help here is a compound one:

CREATE INDEX idx ON tableX (name, age, phone)

The above index, if used, would likely have the following steps:

The B-tree can be seeked searching for Dennis name records.
Once that region of the B-tree is located, the records will then be sorted by age
A scan can be performed to fill the result set with Dennis records, sorted by age
The index also includes the phone column making this a covering index for the query; all the columns it needs can be read directly from the index.

edited Dec 01 '22 at 17:47

Schwern

153,029
25
195
336

answered Dec 01 '22 at 03:31

Tim Biegeleisen

502,043
27
286
360

Is `phone` there to make it only have to read the index and not the table? – Schwern Dec 01 '22 at 07:51
@Schwern Yes, that is correct. `phone` is included to make it a covering index. On other databases, we might have other options. For example, on SQL Server, we can use `INCLUDE` to include the `phone` value only in the leaf node (without making the entire index larger). – Tim Biegeleisen Dec 01 '22 at 07:56

Why order by clause can make use of index?

1 Answers1