2

Assume in tableX we have id(primary key) name and age, phone, all with indices.

In this query: select phone from tableX where name='Dennis' order by age

I guess the process is

  1. Using name index to get ids that match Dennis. Denote the id set by S

  2. Use age index to do order by on the ids obtained in 1, get a sorted id list, denoted by L

  3. Use the sorted id list L to get phone

I assume in step 2 it may use a sequential scan along the B+ tree leaf nodes, checking whether the id in that leaf node is in the id set S obtained in step 1. If so, add it into list L, then we can get a id list L sorted by age.

But how is that better than simple sequential scan? Aren't they both sequential scan?


Edit:

explain says it uses index name and does a filesort

+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+
| id | select_type | table  | partitions | type | possible_keys | key      | key_len | ref   | rows | filtered | Extra          |
+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+
|  1 | SIMPLE      | tableX | NULL       | ref  | idx_name      | idx_name | 123     | const |    1 |   100.00 | Using filesort |
+----+-------------+--------+------------+------+---------------+----------+---------+-------+------+----------+----------------+

Actually I was not sure in what situation index can be useful in order by clause, so I came up with a bad example to illustrate my doubt.

But the example provided by Tim Biegeleisen is great.


(More table details if you are interested:)

mysql> create table tableX(
    -> id int primary key,
    -> name varchar(30),
    -> age int,
    -> phone varchar(30)
    -> );
Query OK, 0 rows affected (0.07 sec)

mysql> create index idx_name on tableX(name);
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create index idx_age on tableX(age);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create index idx_phone on tableX(phone);
Query OK, 0 rows affected (0.03 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show index from tableX;
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table  | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| tableX |          0 | PRIMARY   |            1 | id          | A         |           1 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_name  |            1 | name        | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_age   |            1 | age         | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| tableX |          1 | idx_phone |            1 | phone       | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+--------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
4 rows in set (0.01 sec)

mysql> select * from tableX;
+----+--------+------+-------+
| id | name   | age  | phone |
+----+--------+------+-------+
|  1 | Jack   |   20 | 180   |
|  2 | Dennis |   22 | 180   |
|  3 | Dennis |   18 | 1790  |
+----+--------+------+-------+
Name Null
  • 390
  • 2
  • 11
  • 1
    Don't assume, use explain to see the query plan. I would assume it will use the age index or the name index, but not both. – ysth Dec 01 '22 at 05:34

1 Answers1

3

Actually, the index which should help here is a compound one:

CREATE INDEX idx ON tableX (name, age, phone)

The above index, if used, would likely have the following steps:

  • The B-tree can be seeked searching for Dennis name records.
  • Once that region of the B-tree is located, the records will then be sorted by age
  • A scan can be performed to fill the result set with Dennis records, sorted by age
  • The index also includes the phone column making this a covering index for the query; all the columns it needs can be read directly from the index.
Schwern
  • 153,029
  • 25
  • 195
  • 336
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Is `phone` there to make it only have to read the index and not the table? – Schwern Dec 01 '22 at 07:51
  • @Schwern Yes, that is correct. `phone` is included to make it a covering index. On other databases, we might have other options. For example, on SQL Server, we can use `INCLUDE` to include the `phone` value only in the leaf node (without making the entire index larger). – Tim Biegeleisen Dec 01 '22 at 07:56