5

Saying I have this query:

EXPLAIN SELECT *
FROM (
    SELECT "A" as a, i.n FROM (SELECT 1 AS n) AS i
    UNION ALL SELECT "B" as a, i.n FROM (SELECT 1 AS n) AS i) AS t
WHERE a = "B";

MySQL says

id  select_type table   partitions  type    possible_keys   key key_len ref rows    filtered    Extra
1   PRIMARY <derived2>  \N  ref <auto_key0> <auto_key0> 6   const   1   100.00  \N
2   DERIVED <derived3>  \N  system  \N  \N  \N  \N  1   100.00  \N
3   DERIVED \N  \N  \N  \N  \N  \N  \N  \N  \N  No tables used
4   UNION   <derived5>  \N  system  \N  \N  \N  \N  1   100.00  \N
5   DERIVED \N  \N  \N  \N  \N  \N  \N  \N  \N  No tables used

So MySQL generated an intermediate index <auto_key0> but what is behind this index? What columns are used in it? And is there a way I can set this index manually, and force MySQL to use some columns.

Xenos
  • 3,351
  • 2
  • 27
  • 50
  • If the real Question is about optimization, then provide the real query and we can discuss it. – Rick James Jul 14 '18 at 04:15
  • @RickJames No, the real question is that I would like to know which columns are indexed by `auto_key0` in general, using this setup as an example. – Xenos Jul 16 '18 at 09:17
  • See if `EXPLAIN FORMAT=JSON SELECT ...` has the answer. I doubt if the columns of `` would be provided anywhere else. – Rick James Jul 16 '18 at 17:54
  • @RickJames Oh, neat! I thought both would return the exact same infos in different format, but `FORMAT=JSON` indeed returns the `key: , used_key_parts: ['a'], key_length: 6, ref: ['const']`. You might put it in an answer then and I'll switch to accept yours. – Xenos Jul 17 '18 at 08:02

2 Answers2

6

EXPLAIN FORMAT=JSON SELECT ... will return something like

    key: <auto_key0>, used_key_parts: ['a'], key_length: 6, ref: ['const']

<auto_key0> is the index generated by the Optimizer for a derived table.

(There is also "Optimizer trace"; but that probably does not have this particular info.)

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • That confirms what the other answer inferred, but now, it's explicitly said from the SGBD itself. Thanks. – Xenos Jul 18 '18 at 07:31
5

https://dev.mysql.com/doc/refman/8.0/en/derived-table-optimization.html says:

SELECT *
FROM t1 JOIN (SELECT DISTINCT f1 FROM t2) AS derived_t2
     ON t1.f1=derived_t2.f1;

The optimizer constructs an index over column f1 from derived_t2 if doing so would enable use of ref access for the lowest cost execution plan.

I take that to mean that the columns of the index are determined by the join expression. The MySQL optimizer knows which column(s) would be useful to index because they are referenced in the ON clause of the join.

I don't know of any way to control the index that is generated for the derived table. The index is short-lived. It's created for the temp table during the query, and it will be dropped naturally when the temp table is dropped at the end of the query.

If you want more control over the indexes, you would have to create your own table (either temporary or permanent) and define indexes for it.

You might also like to read this blog about the subject written by the MySQL optimizer lead: https://mysqlserverteam.com/mysql-5-7-improved-performance-of-queries-with-derived-tables/

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • That indeed explains the origin of the index. And why the `UNION ALL` forbids merging, leading to a slow query when `SELECT 1 AS n` is actually a `SELECT n FROM integers` where `integers` is a table of 1.000.000 integers. Then, I'll ask if there's a way to avoid the `UNION ALL` so subquery can bve merged: should I ask in a new SO question? – Xenos Jul 04 '18 at 09:37
  • 1
    Yes you should ask a new SO question about that. – Bill Karwin Jul 04 '18 at 16:35