5

I switched an app from Laravel 5.1 to Laravel 5.8 by setting up a fresh 5.8 project and copying over the files, making some adjustments here and there.

The issue is that the queries with whereHas have become extremely slow.

Here is an example code:

Article::whereHas('categories', function ($category) {
            $category->where('link', 'foto');
        })
        ->active()
        ->recent()
        ->take(3)
        ->get();

This code generates the following query on Laravel 5.1 and completes in 0.05-0.07 seconds.

SELECT *
FROM `articles`
WHERE `articles`.`deleted_at` IS NULL
  AND
    (SELECT count(*)
     FROM `categories`
     INNER JOIN `article_category` 
       ON `categories`.`id` = `article_category`.`category_id`
     WHERE `article_category`.`article_id` = `articles`.`id`
       AND `link` = 'foto'
       AND `categories`.`deleted_at` IS NULL) >= 1
ORDER BY IFNULL(published_at, created_at) DESC
LIMIT 3

and here's the explain:

+------+--------------------+------------------+------+--------------------------------------------------------------------------+-------------------------------------+---------+-----------------+------+----------+------------------------------------+
| id   | select_type        | table            | type | possible_keys                                                            | key                                 | key_len | ref             | rows | filtered | Extra                              |
+------+--------------------+------------------+------+--------------------------------------------------------------------------+-------------------------------------+---------+-----------------+------+----------+------------------------------------+
|    1 | PRIMARY            | articles         | ALL  | NULL                                                                     | NULL                                | NULL    | NULL            | 4846 |   100.00 | Using where; Using filesort        |
|    2 | DEPENDENT SUBQUERY | categories       | ref  | PRIMARY,categories_link_index                                            | categories_link_index               | 767     | const           |    1 |   100.00 | Using index condition; Using where |
|    2 | DEPENDENT SUBQUERY | article_category | ref  | article_category_category_id_foreign,article_category_article_id_foreign | article_category_article_id_foreign | 4       | lcf.articles.id |    1 |   100.00 | Using where                        |
+------+--------------------+------------------+------+--------------------------------------------------------------------------+-------------------------------------+---------+-----------------+------+----------+------------------------------------+

While on Laravel 5.8 it generates the following query that runs 10-13 seconds.

SELECT *
FROM `articles`
WHERE EXISTS
    (SELECT *
     FROM `categories`
     INNER JOIN `article_category` 
       ON `categories`.`id` = `article_category`.`category_id`
     WHERE `articles`.`id` = `article_category`.`article_id`
       AND `link` = 'foto'
       AND `categories`.`deleted_at` IS NULL)
  AND `articles`.`deleted_at` IS NULL
ORDER BY IFNULL(published_at, created_at) DESC
LIMIT 3

and here's the explain

+------+--------------+------------------+------+--------------------------------------------------------------------------+--------------------------------------+---------+-------------------+------+----------+------------------------------------+
| id   | select_type  | table            | type | possible_keys                                                            | key                                  | key_len | ref               | rows | filtered | Extra                              |
+------+--------------+------------------+------+--------------------------------------------------------------------------+--------------------------------------+---------+-------------------+------+----------+------------------------------------+
|    1 | PRIMARY      | <subquery2>      | ALL  | distinct_key                                                             | NULL                                 | NULL    | NULL              |  107 |   100.00 | Using temporary; Using filesort    |
|    1 | PRIMARY      | articles         | ALL  | PRIMARY                                                                  | NULL                                 | NULL    | NULL              | 4846 |    75.01 | Using where                        |
|    2 | MATERIALIZED | categories       | ref  | PRIMARY,categories_link_index                                            | categories_link_index                | 767     | const             |    1 |   100.00 | Using index condition; Using where |
|    2 | MATERIALIZED | article_category | ref  | article_category_category_id_foreign,article_category_article_id_foreign | article_category_category_id_foreign | 4       | lcf.categories.id |  107 |   100.00 |                                    |
+------+--------------+------------------+------+--------------------------------------------------------------------------+--------------------------------------+---------+-------------------+------+----------+------------------------------------+

I ran both codebases on the same server, same MariaDB 10.2.24 database. The dataset size is approximately 6k articles, 80 categories and 10k records in the pivot.

What should I do here? So far I have discovered a bit more than 10 queries suffering from this problem in the codebase. Can I somehow flip a switch in config and make them all check the existence using the old way? Or should I somehow instruct every query to improve their plan?

UPDATE

I just noticed that if I use whereHas(..., '>', 0) I can get almost the old query (actually WHERE (SELECT COUNT...) > 0) with the old performance. However, whereHas(..., '>=', 1) does reduce itself to query with EXISTS. A question remains whether I could switch this behaviour over whole app without editing each query.

ANSWERS TO COMMENTS

Indexes onr articles

+----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name                   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| articles |          0 | PRIMARY                    |            1 | id          | A         |        4846 |     NULL | NULL   |      | BTREE      |         |               |
| articles |          1 | articles_author_id_foreign |            1 | author_id   | A         |          18 |     NULL | NULL   | YES  | BTREE      |         |               |
+----------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Indexes on article_category

+------------------+------------+--------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table            | Non_unique | Key_name                             | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+--------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| article_category |          0 | PRIMARY                              |            1 | id          | A         |        9676 |     NULL | NULL   |      | BTREE      |         |               |
| article_category |          1 | article_category_category_id_foreign |            1 | category_id | A         |          90 |     NULL | NULL   |      | BTREE      |         |               |
| article_category |          1 | article_category_article_id_foreign  |            1 | article_id  | A         |        9676 |     NULL | NULL   |      | BTREE      |         |               |
+------------------+------------+--------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

The data to run the examples can be found here: https://gist.github.com/tontonsb/b97bc33066a67e9d8bc3654f2c01c103

This runs faster, but it's still 2.8 vs 0.07 seconds so the problem can be clearly seen, at least on MariaDB 10.2.24. Probably the speed improved because I have removed other columns and their indices.

Jonas Staudenmeir
  • 24,815
  • 6
  • 63
  • 109
Džuris
  • 2,115
  • 3
  • 27
  • 55
  • 1
    Normally, Query builder is slightly faster than eloquent, so you can try executing raw queries. – Sagar Gautam Jul 27 '19 at 09:52
  • 1
    Can you show index on the `articles` and `article_category` table? It will help. – Zeshan Jul 27 '19 at 09:52
  • Here is issue about whereHas https://github.com/laravel/framework/issues/18415 – Sagar Gautam Jul 27 '19 at 09:58
  • @ZeshanKhattak I added them to the answer – Džuris Jul 27 '19 at 10:02
  • @SagarGautam that issue is about performance of `where (select count(*) ...) >= ...` being worse than a join. In my case I am happy with that performance but I get `where exists (select * ...)` instead in 5.8. – Džuris Jul 27 '19 at 10:05
  • Can you provide sample data that demonstrates the performance issue? – Jonas Staudenmeir Jul 27 '19 at 13:18
  • @JonasStaudenmeir I have published dumps without the actual content columns: https://gist.github.com/tontonsb/b97bc33066a67e9d8bc3654f2c01c103 – Džuris Jul 27 '19 at 14:29
  • Can you test if there is a difference if you use the second query, but place the `WHERE \`articles\`.\`deleted_at\` IS NULL` part before the `EXISTS` condition? Because in theory, an `EXISTS` query should **always** be faster than a `COUNT(*)`, as it only needs to find one matching element and not count all of them. – Namoshek Jul 27 '19 at 14:40
  • @Namoshek yeah, I had tried swapping them and there is no change in either the performance or the query plan. I think the problem here is not that `EXISTS` is slow itself, but it causes a plan change. – Džuris Jul 27 '19 at 14:46
  • Some of it might also be a server/MariaDB issue. The slower query ran below 3 seconds on a fresh DB on that 10.2.24 server and under 1 second on 10.3.16. However that's still an order slower than the first query. The plans doesn't change across the versions. – Džuris Jul 27 '19 at 20:11

1 Answers1

0

Try this:

$articles = Article::query()
    ->hasByNonDependentSubquery('categories', function ($category) {
        $category->where('link', 'foto');
    })
    ->active()
    ->recent()
    ->take(3)
    ->get();
mpyw
  • 5,526
  • 4
  • 30
  • 36