2

I have 2 query about ordering data:

Query 1:

SELECT  * FROM    (
    SELECT      idprovince, COUNT(*) total
    FROM        cities
    JOIN        persons USE INDEX (index_5) USING (idcity)
    WHERE       is_tutor = 'Y'
    GROUP BY    idprovince
) A
ORDER BY total DESC

Query 2:

SELECT      idprovince, COUNT(*) total
FROM        cities
JOIN        persons USE INDEX (index_5) USING (idcity)
WHERE       is_tutor = 'Y'
GROUP BY    idprovince
ORDER BY    total DESC

Query 1 return data much faster than query 2, my question is what is big difference between ordering using query and using it in subquery?

NOTE:my db version is mysql-5.0.96-x64. Data count is about 400k in persons, and 500 in cities.

UPDATE: Output of mysql explain command:

Query 1:

mysql> EXPLAIN
    -> SELECT  *
    -> FROM    (
    ->     SELECT      idprovince, COUNT(*) total
    ->     FROM        cities
    ->     JOIN        persons USE INDEX (index_5) USING (idcity)
    ->     WHERE       is_tutor = 'Y'
    ->     GROUP BY    idprovince
    -> ) A
    -> ORDER BY total DESC
    -> ;
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref                                | rows   | Extra                                        |
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL                               |     34 | Using filesort                               |
|  2 | DERIVED     | persons    | ref    | index_5       | index_5 | 2       |                                    | 163316 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | cities     | eq_ref | PRIMARY       | PRIMARY | 4       | _myproject_lesaja_2.persons.idcity |      1 |                                              |
+----+-------------+------------+--------+---------------+---------+---------+------------------------------------+--------+----------------------------------------------+
3 rows in set (1.22 sec)

Query 2:

mysql> EXPLAIN
    ->     SELECT      idprovince, COUNT(*) total
    ->     FROM        cities
    ->     JOIN        persons USE INDEX (index_5) USING (idcity)
    ->     WHERE       is_tutor = 'Y'
    ->     GROUP BY    idprovince
    ->     ORDER BY    total DESC;
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
| id | select_type | table   | type  | possible_keys | key         | key_len | ref   | rows   | Extra                                        |
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
|  1 | SIMPLE      | cities  | index | PRIMARY       | FK_cities_1 | 4       | NULL  |      4 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | persons | ref   | index_5       | index_5     | 2       | const | 163316 | Using where                                  |
+----+-------------+---------+-------+---------------+-------------+---------+-------+--------+----------------------------------------------+
2 rows in set (0.00 sec)

Result Query 1:

mysql> SELECT  *
    -> FROM    (
    ->     SELECT      idprovince, COUNT(*) total
    ->     FROM        cities
    ->     JOIN        persons USE INDEX (index_5) USING (idcity)
    ->     WHERE       is_tutor = 'Y'
    ->     GROUP BY    idprovince
    -> ) A
    -> ORDER BY total DESC
    -> ;
+------------+-------+
| idprovince | total |
+------------+-------+
|         35 | 15797 |
......................
......................
......................

|         76 |  2091 |
|         65 |  2018 |
+------------+-------+
34 rows in set (0.78 sec)

Result Query 2:

mysql> SELECT      idprovince, COUNT(*) total
    -> FROM        cities
    -> JOIN        persons USE INDEX (index_5) USING (idcity)
    -> WHERE       is_tutor = 'Y'
    -> GROUP BY    idprovince
    -> ORDER BY    total DESC;
+------------+-------+
| idprovince | total |
+------------+-------+
|         35 | 15797 |
|         33 | 14413 |
|         12 | 13683 |
......................
......................
......................
|         34 |  2135 |
|         76 |  2091 |
|         65 |  2018 |
+------------+-------+
34 rows in set (8 min 25.80 sec)

SHOW PROFILE OUTPUT: QUERY 1:

+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000240 |
| Opening tables       | 0.000043 |
| System lock          | 0.000004 |
| Table lock           | 0.000392 |
| optimizing           | 0.000084 |
| statistics           | 0.004455 |
| preparing            | 0.000026 |
| Creating tmp table   | 0.000221 |
| executing            | 0.000002 |
| Copying to tmp table | 0.913722 |
| Sorting result       | 0.000065 |
| Sending data         | 0.000020 |
| removing tmp table   | 0.000145 |
| Sending data         | 0.000008 |
| init                 | 0.000017 |
| optimizing           | 0.000002 |
| statistics           | 0.000038 |
| preparing            | 0.000007 |
| executing            | 0.000001 |
| Sorting result       | 0.000012 |
| Sending data         | 0.000337 |
| end                  | 0.000002 |
| end                  | 0.000002 |
| query end            | 0.000002 |
| freeing items        | 0.000020 |
| closing tables       | 0.000001 |
| removing tmp table   | 0.000074 |
| closing tables       | 0.000003 |
| logging slow query   | 0.000001 |
| cleaning up          | 0.000003 |
+----------------------+----------+

QUERY 2:

+----------------------+------------+
| Status               |   Duration |
+----------------------+------------+
| starting             |   0.000195 |
| Opening tables       |   0.000029 |
| System lock          |   0.000004 |
| Table lock           |   0.000011 |
| init                 |   0.000078 |
| optimizing           |   0.000021 |
| statistics           |   0.003399 |
| preparing            |   0.000025 |
| Creating tmp table   |   0.000259 |
| Sorting for group    |   0.000007 |
| executing            |   0.000001 |
| Copying to tmp table | 506.711308 |
| Sorting result       |   0.000049 |
| Sending data         |   0.000298 |
| end                  |   0.000004 |
| removing tmp table   |   0.000150 |
| end                  |   0.000002 |
| end                  |   0.000002 |
| query end            |   0.000002 |
| freeing items        |   0.000013 |
| closing tables       |   0.000003 |
| logging slow query   |   0.000001 |
| logging slow query   |   0.000042 |
| cleaning up          |   0.000003 |
+----------------------+------------+

CREATE STATEMENT

CREATE TABLE persons (
    idperson INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    is_tutor ENUM('Y','N') NULL DEFAULT 'N',
    name VARCHAR(64) NOT NULL,
    ...
    idcity INT(10) UNSIGNED NOT NULL,
    ...
    PRIMARY KEY (idperson),
    UNIQUE INDEX index_3 (name) USING BTREE,
    UNIQUE INDEX index_4 (email) USING BTREE,
    INDEX index_5 (is_tutor),
    ...
    CONSTRAINT FK_persons_1 FOREIGN KEY (idcity) REFERENCES cities (idcity)
)
ENGINE=InnoDB
AUTO_INCREMENT=414738;

CREATE TABLE cities (
    idcity INT(10) UNSIGNED NOT NULL,
    idprovince INT(10) UNSIGNED NOT NULL,
    city VARCHAR(64) NOT NULL,
    PRIMARY KEY (idcity),
    UNIQUE INDEX index_3 (city),
    INDEX FK_cities_1 (idprovince),
    CONSTRAINT FK_cities_1 FOREIGN KEY (idprovince) REFERENCES provinces (idprovince)
)
ENGINE=InnoDB;
bramaningds
  • 41
  • 1
  • 5

1 Answers1

0

I am admittedly not an expert on this one but looking at MySQL Documentation on ORDER BY Optimization, you have not only one but two un-optimized use of ORDER BY in your Query No. 2:

SELECT      idprovince, COUNT(*) total
FROM        cities
JOIN        persons USE INDEX (index_5) USING (idcity)
WHERE       is_tutor = 'Y'
GROUP BY    idprovince
ORDER BY    total DESC

First one :

The key used to fetch the rows

WHERE is_tutor = 'Y'

is not the same as the one used in the ORDER BY:

ORDER BY total DESC

Second one :

You have different ORDER BY and GROUP BY expressions.

GROUP BY    idprovince
ORDER BY    total DESC

On the two cases above MySQL will not use Indexes in order to resolve ORDER BY although it could use indexes in searching for the rows to match the WHERE clause.

On other hand your Query No. 1, follows the optimized form of ORDER BY although the ORDER BY is used outside the sub-query.

Thus could be the reason that Query No. 2 is far slower than Query No. 1.

Additionally, in both cases the Index (idCity) will be virtually useless in resolving also ORDER BY because index uses idCity while ORDER BY clause uses Total which is an aggregate result.

See discussion here also.

Community
  • 1
  • 1
Edper
  • 9,144
  • 1
  • 27
  • 46
  • Hi. Thanks for your respond. 'total' is aggregat function, so it must not use indexes for ordering. Index_5 is index for is_tutor field. – bramaningds Nov 03 '13 at 13:35
  • @bramaningds Could you show profile for your Query No. 1? I could see that your Query No. 2's copying to temporary table has the most time consume in your query. – Edper Nov 03 '13 at 13:40
  • Query 1 takes about 780ms, so i guess it will be below 1 s for all profiles. Now, i m on the way and will be home in 12 hours from now. I will update profile output for query 1 asap. – bramaningds Nov 03 '13 at 14:17