
I have the following query:

SELECT 
    `Job`.`id`, 
    `Job`.`serviceId`, 
    `Job`.`externalId`, 
    `Job`.`createdAt`,
    `Job`.`updatedAt`,
    `logs`.`id` AS `logs.id`,
    `logs`.`statusId` AS `logs.statusId`,
    `logs`.`createdAt` AS `logs.createdAt`
FROM `Jobs` AS `Job` 
LEFT OUTER JOIN (
    SELECT id, jobId, statusId, createdAt, updatedAt
    FROM JobLogs
    WHERE id IN (
        SELECT MAX(id)
        FROM JobLogs
        GROUP BY jobId
    )
) AS `logs` ON `Job`.`id` = `logs`.`jobId`
ORDER BY `Job`.`id` DESC;

It returns all Jobs, each with only its latest JobLog. It works fine: running it in e.g. Sequel Pro against the DB in the container returns all 1000 jobs I previously inserted in ~10ms.

Because my API is written in node with sequelize/mysql2 under the hood, I simply copied the above query into a sequelize.query() call (and, for testing purposes, into a bare-metal mysql2 query, with the same result): the query takes ~35s and the mysql process stays at 100% CPU the whole time.
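
For reference, the bare-metal mysql2 version looks roughly like this (a minimal sketch - query.sql holds the exact statement from above, and the connection options are placeholders for my actual config):

// Minimal reproduction with the plain mysql2 promise API.
// query.sql contains the exact SELECT statement shown above;
// the connection options are placeholders for the real config.
const fs = require('fs');
const mysql = require('mysql2/promise');

const sql = fs.readFileSync('query.sql', 'utf8');

(async () => {
  const connection = await mysql.createConnection({
    host: '127.0.0.1',
    user: 'root',
    password: 'secret',
    database: 'db',
  });

  console.time('query');
  const [rows] = await connection.query(sql);
  console.timeEnd('query'); // ~35s here, ~10ms in Sequel Pro

  console.log(rows.length); // 1000
  await connection.end();
})();

The sequelize variant just passes the same string to sequelize.query(sql, { type: QueryTypes.SELECT }) and behaves identically.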

Why is the query so much slower when run from node/mysql2 than from e.g. Sequel Pro? What am I missing here?

It's not the LIMIT 1000 that the client adds, because there are only 1000 entries in the table ...


edit

$ EXPLAIN <SQL> returns the following:

+----+-------------+---------+------------+-------+---------------+---------+---------+-----------+------+----------+--------------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref       | rows | filtered | Extra                    |
+----+-------------+---------+------------+-------+---------------+---------+---------+-----------+------+----------+--------------------------+
|  1 | PRIMARY     | Job     | NULL       | index | NULL          | PRIMARY | 4       | NULL      | 1000 |   100.00 | NULL                     |
|  1 | PRIMARY     | JobLogs | NULL       | ref   | jobId         | jobId   | 4       | db.Job.id |   13 |   100.00 | Using where              |
|  3 | SUBQUERY    | JobLogs | NULL       | range | jobId         | jobId   | 4       | NULL      | 1000 |   100.00 | Using index for group-by |
+----+-------------+---------+------------+-------+---------------+---------+---------+-----------+------+----------+--------------------------+
3 rows in set, 1 warning (0.00 sec)

And I'm using the mysql:5.7.20 Docker container without any extra configuration that I know of ...
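
The EXPLAIN above comes from the mysql CLI in the container. To check whether the node session gets a different plan, the same EXPLAIN can be run through mysql2 itself - a sketch, reusing connection and sql from the snippet above:

// Inside the async function from the reproduction snippet above:
// run EXPLAIN over the very connection that executes the slow query.
const [plan] = await connection.query('EXPLAIN ' + sql);
console.table(plan); // compare against the CLI plan above

// Session-level optimizer settings can also change the plan:
const [vars] = await connection.query("SHOW VARIABLES LIKE 'optimizer_switch'");
console.log(vars);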


edit 2

It doesn't look like it's a caching issue: dropping and recreating the database, inserting new (random) values, and then running the very first query from Sequel Pro still gives a ~30ms response.

  • Before drawing any conclusions, could you please run `EXPLAIN` and update your question with the details? It's possible that MySQL is not using indices and is performing a full-table scan – Oleg Andreyev Feb 24 '20 at 22:37
  • Was your faster run hitting a query cache? Which MySQL version? Which configuration? – danblack Feb 24 '20 at 22:58
  • @danblack I added some more info - maybe this helps – Philipp Kyeck Feb 25 '20 at 10:18
  • I'm quite sure it's mostly due to Sequelize itself. It is an ORM, which means it maps the plain array returned by the underlying driver (mysql2) into objects - creating them from scratch and filling them with data. At least I've had several issues of this kind in the past, so pagination is your friend. Can you tell which process used 100% CPU? If it is the node process, then it's the mapping - maybe your object definition is quite complicated. – Seti Mar 09 '20 at 16:04
  • @Seti it was mysql that was using 100%. And I also tried just using the plain mysql2 module, so it shouldn't be a problem with mapping etc. – Philipp Kyeck Mar 09 '20 at 16:25
  • Can you show us the EXPLAIN of the query when run via sequelize.query too (on the database you are using with the long query)? Outer joins are slow by themselves – Seti Mar 10 '20 at 07:56
  • @Seti the EXPLAIN is already there - that is the only query I'm running. – Philipp Kyeck Mar 10 '20 at 09:50
