9

I am in the process of optimizing the queries in my MySQL database. While using Visual Explain and looking at various query costs, I'm repeatedly finding counter-intuitive values. Operations which use more efficient lookups (e.g. key lookup) seem to have a higher query cost than ostensibly less efficient operations (e.g full table scan or full index scan).

Examples of this can even be seen in the MySQL manual, in the section regarding Visual Explain on this page: enter image description here The query cost for the full table scan is a fraction of the key-lookup-based query costs. I see exactly the same scenario in my own database.

All this seems perfectly backwards to me, and raises this question: should I use query cost as the standard when optimizing a query? Or have I fundamentally misunderstood query cost?

Haydentech
  • 1,096
  • 1
  • 11
  • 27
  • At the end of the day the "actual performance" is usually what matters most.. estimated (and actual) query plans are nice. Having the relevant queries run snapily, on the other hand, is generally a requirement. – user2864740 Aug 17 '17 at 16:27
  • 1
    Also, I'm not sure what the question is asking - the cost of the base FTS is meaningless by itself as it's only one part of the information required to answer the query. (If the table has an appropriate index and amount of data it should change from a full scan to an index scan: this step effectively just "builds the initial N [1000] rows". In this case the planner thought the table was small enough or had no usable index.) – user2864740 Aug 17 '17 at 16:29
  • I understand what each step of the query is doing and how it's doing it. What I don't understand is that, almost without exception, the query planner thinks indexed reads are more expensive than non-indexed reads when common sense would tell you otherwise. For example, in my database I found a query doing a full table scan, so I added an appropriate index, the planner then picked that index, yet EXPLAIN showed that the query cost went **up** after that change. – Haydentech Aug 17 '17 at 18:20

1 Answers1

3

MySQL does not have very good metrics relating to Optimization. One of the better ones is EXPLAIN FORMAT=JSON SELECT ..., but it is somewhat cryptic.

Some 'serious' flaws:

  • Rarely does anything account for a LIMIT.
  • Statistics on indexes are crude and do not allow for uneven distribution. (Histograms are coming 'soon'.)
  • Very little is done about whether data/indexes are currently cached, and nothing about whether you have a spinning drive or SSD.

I like this because it lets me compare two formulations/indexes/etc even for small tables where timing is next to useless:

FLUSH STATUS;
perform the query
SHOW SESSION STATUS LIKE "Handler%";

It provides exact counts (unlike EXPLAIN) of reads, writes (to temp table), etc. Its main flaw is in not differentiating how long a read/write took (due to caching, index lookup, etc). However, it is often very good at pointing out whether a query did a table/index scan versus lookup versus multiple scans.

The regular EXPLAIN fails to point out multiple sorts, such as might happen with GROUP BY and ORDER BY. And "Using filesort" does not necessarily mean anything is written to disk.

Rick James
  • 135,179
  • 13
  • 127
  • 222