
With SET GLOBAL log_slow_verbosity='query_plan,explain'; enabled, the slow log gives me a lot of output, but I am struggling to understand the explanation.

# User@Host: root[root] @  [10.0.1.5]
# Thread_id: 31  Schema: enterprise  QC_hit: No
# Query_time: 0.654855  Lock_time: 0.000245  Rows_sent: 50  Rows_examined: 279419
# Rows_affected: 0
# Full_scan: Yes  Full_join: Yes  Tmp_table: Yes  Tmp_table_on_disk: Yes
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: Yes
#
# explain: id   select_type     table   type    possible_keys   key     key_len ref     rows    r_rows  filtered        r_filtered      Extra
# explain: 1    SIMPLE  external_property_groups_areas  ALL     unique_id_area,search_id_area,search_country    NULL    NULL    NULL    20      20.00   100.00  100.00  Using temporary; Using filesort
# explain: 1    SIMPLE  external_property_level_1_buildings     ref     unique_id_building,building_id_area_id  building_id_area_id     5       enterprise.external_property_groups_areas.id_area        3  6.00     100.00  100.00
# explain: 1    SIMPLE  external_property_level_2_units ref     unit_building_id,property_level_2_created_by    unit_building_id        4       enterprise.external_property_level_1_buildings.id_building  25.13    100.00  100.00  Using index condition
# explain: 1    SIMPLE  ut_unit_types   eq_ref  unique_property_type,query_optimization_designation      unique_property_type     1022    enterprise.external_property_level_2_units.unit_type  1       1.00    100.00  100.00  Using where; Using index
# explain: 1    SIMPLE  property_level_2_units  eq_ref  PRIMARY,property_level_2_organization_id        PRIMARY 1530    enterprise.external_property_level_2_units.external_id,enterprise.external_property_level_2_units.external_system_id,enterprise.external_property_level_2_units.external_table,const   1       0.98    100.00  100.00
# explain: 1    SIMPLE  a       eq_ref  unique_id_unit,unit_building_id unique_id_unit  4       enterprise.property_level_2_units.system_id_unit 1       0.98    100.00  100.00  Using where
# explain: 1    SIMPLE  c       eq_ref  unique_id_building      unique_id_building      4       enterprise.a.building_system_id  1       1.00    100.00  100.00  Using index
# explain: 1    SIMPLE  b       ref     property_property_type  property_property_type  4       const   142     458.00  100.00  0.17    Using where
# explain: 1    SIMPLE  property_groups_countries       ALL     country_names,coutnry_codes     NULL    NULL    NULL    245     245.00  100.00  0.31    Using where; Using join buffer (flat, BNL join)
#
  • How do I identify the slow parts of the query?
  • Is there a shortcut to quickly identify missing indexes?

I also have a screencast of my session.

It would be great if you could point me to resources that help me figure out how to improve the performance of these SQL queries.

hendry
    [compound indexes](http://mysql.rjweb.org/doc.php/index1), [query optimization](https://github.com/jynus/query-optimization). Things to look for: tables that aren't using a key (`ALL`), the amount of index used (key_len). `filesort` (really any sort) might matter if there are a lot of rows. – danblack May 07 '19 at 04:50

2 Answers


Your question is quite general, so I will answer your concrete questions first and then point you to where you can find more information.

The EXPLAIN output in the slow log is fine, and you are not alone in finding it hard to read. Use the logs to identify your slow queries, then use EXPLAIN (and other tools) to debug what is going on. Having it in the log is convenient, but for readability, run it live against your database like this:

[screenshot: running EXPLAIN interactively in the client]
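For instance, against the tables in your plan (the join condition and filter value here are illustrative, not taken from your actual query):

```sql
-- Running EXPLAIN interactively gives aligned, readable output;
-- \G in the mysql/mariadb client prints one column per line,
-- which is far easier to scan than the flattened slow-log format.
EXPLAIN
SELECT a.system_id_unit, b.property_type
FROM a
JOIN b ON b.id_building = a.building_system_id
WHERE b.property_type = 3\G
```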

Answering your questions:

How do I know if an index is not being used?

The type (and key) columns will tell you. type: ALL means a full table scan is being used, and possible_keys/key will be NULL. That covers scanning. type: const, ref or range is normally preferred (there are more strategies). For sorting (and other issues), look in the Extra column for the string Using filesort. It means a second pass is needed to sort the results; in some cases an index can return the results already in order and avoid that pass.

Here is an example of another query, this time using an index:

[screenshot: EXPLAIN of a query using an index]
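A minimal sketch of the before/after difference, using one of your tables (the column name `country_code` is an assumption based on your `coutnry_codes` key name; the EXPLAIN values shown are illustrative):

```sql
-- Before: no usable index on the filtered column -> full table scan
EXPLAIN SELECT * FROM property_groups_countries WHERE country_code = 'DE';
-- type: ALL, key: NULL, rows: 245

-- Add an index on the column used in the WHERE clause
ALTER TABLE property_groups_countries ADD INDEX idx_country_code (country_code);

-- After: point lookup via the new index
EXPLAIN SELECT * FROM property_groups_countries WHERE country_code = 'DE';
-- type: ref, key: idx_country_code, rows: 1
```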

This is a simplification, because there are many ways in which an index can be used to speed up results (ICP, covering index, max(), ...).

There is not enough space here to also cover JOINs and subqueries, where reordering and rewriting can lead to better strategies.

How do I identify the slow parts of the query?

There are two options:

  • Profiling the query, which gives you the time spent on each execution stage. This can be done with SHOW PROFILE, or by enabling stage instrumentation in performance_schema for specific queries. Typical output:

    SHOW PROFILE CPU FOR QUERY 5;
    +----------------------+----------+----------+------------+
    | Status               | Duration | CPU_user | CPU_system |
    +----------------------+----------+----------+------------+
    | starting             | 0.000042 | 0.000000 |   0.000000 |
    | checking permissions | 0.000044 | 0.000000 |   0.000000 |
    | creating table       | 0.244645 | 0.000000 |   0.000000 |
    | After create         | 0.000013 | 0.000000 |   0.000000 |
    | query end            | 0.000003 | 0.000000 |   0.000000 |
    | freeing items        | 0.000016 | 0.000000 |   0.000000 |
    | logging slow query   | 0.000003 | 0.000000 |   0.000000 |
    | cleaning up          | 0.000003 | 0.000000 |   0.000000 |
    +----------------------+----------+----------+------------+
    
  • Handler statistics, which give you a time-independent view of the scan strategy and the number of rows read at each step:

    [screenshot: handler statistics output]

This last one may seem a bit unfriendly, but once you understand what the internal engine calls mean, you can spot index usage and full scans much more easily.
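A sketch of how to read them for a single query (the table is taken from your plan; the column name `country_name` is an assumption based on your `country_names` key, and the counter values are illustrative):

```sql
FLUSH STATUS;  -- reset the session counters

SELECT COUNT(*) FROM property_groups_countries WHERE country_name = 'Germany';

SHOW SESSION STATUS LIKE 'Handler_read%';
-- A high Handler_read_rnd_next (roughly one per table row) means every
-- row was scanned: a full table scan.
-- High Handler_read_key / Handler_read_next instead would indicate the
-- engine walked an index (point lookup / range scan).
```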

Is there a shortcut to quickly identify missing indexes?

Yes. If you have performance_schema enabled and access to the sys schema, SELECT * FROM sys.statement_analysis; includes a full_scan column that flags queries performing full scans (scans not using an index). You can then order by rows_examined, rows_examined_avg, avg_latency, etc. to work through them in order of importance.
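For example (assuming a MySQL 5.7+ or recent MariaDB install where the sys schema is available):

```sql
-- Top 10 full-scanning statement classes by total rows examined
SELECT query, db, exec_count, rows_examined_avg, avg_latency
FROM sys.statement_analysis
WHERE full_scan = '*'          -- '*' marks statements doing full scans
ORDER BY rows_examined DESC
LIMIT 10;
```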

If you don't want or cannot use performance_schema, use the logs to get those numbers aggregated with pt-query-digest from percona-toolkit:

[screenshot: pt-query-digest report]
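A typical invocation (the log path is an assumption; check the `slow_query_log_file` variable in your configuration):

```shell
# Aggregate the slow log into a ranked report of query classes
pt-query-digest /var/lib/mysql/myhost-slow.log > digest.txt
```

Each entry in the report shows total and average time plus rows sent versus rows examined; a large gap between the two usually points to a missing index.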

If rows examined is very large compared to rows sent (in your log, 279,419 examined for only 50 sent), a missing index is the likely cause.

In summary, the logs are fine for identifying slow queries; aggregate them with performance_schema or pt-query-digest. Once you have identified the worst offenders, use other tools for debugging.

I cover how to identify slow queries, and the details of query optimization, at more length in my slides "Query Optimization with MySQL 8.0 and MariaDB 10.3". I do this for a living and query optimization is my passion; I suggest you have a look (I am not selling you a book, the slides are free and under a Creative Commons license).

jynus
  • Great start +1, but there is much more to query optimisation. The MySQL optimizer can sometimes rewrite SQL to optimize, or choose not to use an index when a full scan is cheaper – Raymond Nijland May 07 '19 at 09:36
  • 1
    @Raymond Nijland I talk about that at https://www.slideshare.net/jynus/query-optimization-with-mysql-80-and-mariadb-103-the-basics/54 I cannot summarize 200+ slides here :-D – jynus May 07 '19 at 09:42
  • Also, tracing the optimizer is a thing https://dev.mysql.com/doc/internals/en/optimizer-tracing.html, to add a suggestion to these good resources, if it's not already in the slides, which I didn't look into – Raymond Nijland May 07 '19 at 09:44
  • No problem, I didn't look into the slides, and indeed it's true you can't place all the information here – Raymond Nijland May 07 '19 at 09:44

"slow queries not using an index" -- Who cares? What you should care about is "slow queries". Then in trying to figure out why a query is slow, you may or may not discover an index problem.

Use the digest described by jynus and show us the 'worst' query. Provide SHOW CREATE TABLE for the table(s) it uses and EXPLAIN SELECT .... Then we can walk through it and find out whether an index is at fault, or what else is.
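For example (hypothetical; substitute the tables from your worst query):

```sql
SHOW CREATE TABLE property_groups_countries\G
SHOW CREATE TABLE external_property_groups_areas\G
EXPLAIN SELECT ...\G   -- paste the worst query from the digest, verbatim
```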

More on digesting, etc: http://mysql.rjweb.org/doc.php/mysql_analysis#slow_queries_and_slowlog

Re video

Alas, EXPLAIN fails to tell you what index is needed or even which table needs a better index.

When presenting text, please use text, not video.

The "Rows" and "r_rows", when high, give you a clue that that table might be lacking a good index. Please provide SHOW CREATE TABLE so we can see what indexes you already have.

If possible, use short aliases; the long ones lead to a lot of clutter.

Rick James
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - [From Review](/review/low-quality-posts/28287618) – Trenton McKinney Feb 10 '21 at 06:42
    @TrentonMcKinney - Ding me all you want. I have developed about 20 articles that I point to repeatedly rather than spending time repeating myself. They have been up for a decade and I have no intent of taking them down. Furthermore, I occasionally update them to keep up with new releases of MySQL/MariaDB. – Rick James Feb 10 '21 at 07:53
  • I'm not the one who downvoted your answer. – Trenton McKinney Feb 10 '21 at 14:50