mysql does full index scan even though there are no more rows

Question

I am trying to find a chain of exactly 2^1024 lenght, here is a simplified version:

CREATE VIEW full2 AS SELECT a.id1,b.id2 FROM gt a JOIN gt b ON a.id2=b.id1;
CREATE VIEW full3 AS SELECT a.id1,b.id2 FROM full2 a JOIN full2 b ON a.id2=b.id1;
CREATE VIEW full4 AS SELECT a.id1,b.id2 FROM full3 a JOIN full3 b ON a.id2=b.id1;

The problem is that a lot of rows die out after after f2, but somehow mysql's execution plan is

+----+-------------+-------+------------+-------+---------------+------+---------+--------------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref          | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+------+---------+--------------+------+----------+--------------------------+
|  1 | SIMPLE      | gt1   | NULL       | index | id1,id2       | id2  | 4       | NULL         |   33 |    81.00 | Using where; Using index |
|  1 | SIMPLE      | gt2   | NULL       | ref   | id1,id2       | id2  | 4       | picr.gt1.id1 |    1 |    90.00 | Using where; Using index |
|  1 | SIMPLE      | gt2   | NULL       | ref   | id1,id2       | id1  | 4       | picr.gt1.id2 |    1 |    90.00 | Using where; Using index |
|  1 | SIMPLE      | gt1   | NULL       | ref   | id1,id2       | id2  | 4       | picr.gt2.id1 |    1 |    90.00 | Using where; Using index |
|  1 | SIMPLE      | gt1   | NULL       | ref   | id1,id2       | id1  | 4       | picr.gt2.id2 |    1 |    90.00 | Using where; Using index |
|  1 | SIMPLE      | gt2   | NULL       | ref   | id1,id2       | id2  | 4       | picr.gt1.id1 |    1 |    90.00 | Using where; Using index |
|  1 | SIMPLE      | gt2   | NULL       | ref   | id1           | id1  | 4       | picr.gt1.id2 |    1 |   100.00 | Using index              |
|  1 | SIMPLE      | gt1   | NULL       | ref   | id2           | id2  | 4       | picr.gt2.id1 |    1 |   100.00 | Using index              |
+----+-------------+-------+------------+-------+---------------+------+---------+--------------+------+----------+--------------------------+

Which I should explain:

for each row in comparison1 that id2 MIGHT exist in comparison.id1
 for each row in comparison2 that id2 MIGHT exist in comparison.id1 and id1 MIGHT exist in comparison.id2
  for each row in comparison3 that id2 MIGHT exist in comparison.id1 and id1 MIGHT exist in comparison.id2
   ...
         does comparison1.id2=comparison2.id1 and comparison2.id2=comparison3.id1 comparison3.id2=comparison4.id1 ... 
   ...
  }
 }
}

This is a huge drain on query time. This is because I actually don't have any rows in f2(not yet anyway) and if nothing is in f2 then joining f2 (nothing) to actually any other query should immediately stop the sql calculation(yet it might start right to left, but same problem, right side will also be empty). Calculation time rises exponentially on each new view.

How can I do something like this?

for each row in comparison1 
 for each row in comparison2 that id1 exists in comparison1 .id2
 if fail then skip
  for each row in comparison3 that id1 exists in comparison2 .id2
  if fail then skip
   ...
         does comparison1.id2=comparison2.id1 and comparison2.id2=comparison3.id1 comparison3.id2=comparison4.id1 ... 
   ...
  }
 }
}

It is not possible? Yes it is! This is a more weird query, won't go into details(it was supposed to join everything from lenght 1 to 1024 without any unions and saves id1,id2 and previous id2 and has huge where condition) but in essence it does more and faster!

mysql> describe select * from part4_v2_3colleftjoin;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                          |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+------------------------------------------------+
|  1 | SIMPLE      | p     | NULL       | index | NULL          | id2  | 4       | NULL |   33 |   100.00 | Using index                                    |
|  1 | SIMPLE      | f     | NULL       | ALL   | id1,id2       | NULL | NULL    | NULL |   33 |   100.00 | Range checked for each record (index map: 0x3) |
|  1 | SIMPLE      | p     | NULL       | ALL   | id1,id2       | NULL | NULL    | NULL |   33 |   100.00 | Range checked for each record (index map: 0x3) |
|  1 | SIMPLE      | f     | NULL       | ALL   | id1,id2       | NULL | NULL    | NULL |   33 |   100.00 | Range checked for each record (index map: 0x3) |
|  1 | SIMPLE      | p     | NULL       | index | id1           | id2  | 4       | NULL |   33 |   100.00 | Using where; Using index                       |
|  1 | SIMPLE      | f     | NULL       | ALL   | id1,id2       | NULL | NULL    | NULL |   33 |   100.00 | Range checked for each record (index map: 0x3) |
|  1 | SIMPLE      | p     | NULL       | ALL   | id1,id2       | NULL | NULL    | NULL |   33 |   100.00 | Range checked for each record (index map: 0x3) |
|  1 | SIMPLE      | f     | NULL       | ALL   | id1,id2       | NULL | NULL    | NULL |   33 |   100.00 | Range checked for each record (index map: 0x3) |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+------------------------------------------------+

For anyone interested in the overly complicated select here it is `create or replace view part2 as select p.id1, p.id2, CASE WHEN p.id2=f.id1 THEN f.id2 ELSE null END AS id3 from gt p left join gt f on p.id2=f.id1 or (p.id1=f.id1 and p.id2=f.id2); create or replace view part3 as select p.id1,coalesce(p.id3,p.id2) id2, f.id3 from part2 p left join part2 f on p.id3 is not null and p.id3=f.id1 and f.id3 is not null or (p.id3 is not null and f.id3 is null and f.id2=p.id2);`,2 is unique, 3,4,5 ... are identical. — Then Enok, Oct 16 '17 at 20:03
Is this a question? Or a trick you want to share with us? If it's the latter, you should do it as a Q&A: write a question, then self-answer it. It's allowed and encouraged to do so - but make the actual question clear (e.g. add sample data and result, execution times, explain what `f2` from your text is, ...). Also be aware that for some data, your last execution plan might take days. It may be fast for specific data (short chains?), but since MySQL doesn't know that, it prefers a "safe approach" (for most situations) and you can "beat it" with prior knowledge. How do you e.g. prevent loops? — Solarflare, Oct 17 '17 at 07:18
You might have a point.For the moment I have no idea what to do, all I want is for mysql to not loop table `gt` 1024 times if there are no more rows found previously(ie when `select * from part3` if subquery 1 `part2 a` returns no rows then i expect `part2 b` to not matter in any way and to not calculate it and return no rows for part3. Not calculate both subqueries one after the other. I know it's not simoultaneous because it takes exponentially more time). I used `f2` for the `full2` view, without even noticing. Times: f2=.01s f3=.01 f4=.01 f5=.02 f6=.26 f7=11.93 f9>5min — Then Enok, Oct 17 '17 at 23:36
As I said: to make your question more understandable, you should add e.g. sample data, as currently, it's hard to follow (it will, as mentioned, really really really depend on your data). But generally: your (recursive) task (a graph search) is completely unsuited for a database (including MySQL 8+ or MariaDB 10.2+, which support recursive ctes), see e.g. [here](https://stackoverflow.com/q/4074794/6248528). Take the advice in the second answer, and implement it outside sql (or at least in a procedure) ([On the complexity of path problems](https://en.wikipedia.org/wiki/Longest_path_problem)). — Solarflare, Oct 18 '17 at 05:41

mysql does full index scan even though there are no more rows

0 Answers0