2

Ok, so this is basically a yes/no question. I have a Node application running on Heroku with a Heroku Postgres plan, and I use Objection.js as the ORM. I'm seeing 300+ ms response times on most of my endpoints, and I have a theory about why this is. I would like to have this confirmed.

Most of my api endpoints do around 5-10 eager loads. Objection.js handles eager loading by doing additional WHERE IN queries and not by doing one big query with a lot of JOINS. The reason for this is that it was easier to build that way and that it shouldn't harm performance too much.
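
For illustration, a simplified example of such a query (the relation names are just an example):

const products = await Product
  .query()
  .eager('[brand, shops, categories, barcodes]');

Objection executes this as one query for the products plus one extra WHERE ... IN (...) query per relation, instead of one big query with joins.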

But that got me thinking: I assume Heroku Postgres doesn't run on the same Heroku dyno as the Node application, so there is network latency for every query. Could it be that all these latencies add up, causing a total delay of 300ms?

Summarizing: would it be speedier to build your own queries with Knex instead of generating them through Objection.js, in case you have a separately hosted database?

Gersom
  • 651
  • 3
  • 19
  • You can try turning on slow query logging (log_min_duration_statement) in Postgres to see if the query itself is slow or if it's just network latency. – Dat Tran Aug 10 '18 at 09:32
  • Heroku has a special dashboard for time consuming queries. I'm looking at that now, and no query takes longer than 1ms – Gersom Aug 10 '18 at 09:42
  • 1
    I don't have much experience with heroku-postgres. But it's absolutely true that if there is network latency between your app and your DB, one big query will be a big optimization compared to one main query plus 5-10 subqueries to get the data back. Moreover, because of the network latency, if you can reduce the payload size sent over the network (choose which columns to query), it will also be a significant optimization. – Dat Tran Aug 10 '18 at 10:01

3 Answers

2

Speculating about why this kind of slowdown happens is mostly useless (I'll speculate a bit at the end anyway ;)). The first thing to do in this kind of situation is to measure where those 300ms are actually spent.

From the database you should be able to see query times, to spot whether any slow queries are causing the problems.

Knex also outputs some performance information to the console when you run it with the DEBUG=knex:* environment variable set.

Nowadays Node also has built-in profiling support, which you can enable by passing the --inspect flag when starting Node. You can then attach Chrome DevTools to the Node process and see where Node is spending its time. From that profile you will be able to see, for example, whether parsing database query results dominates the execution time.
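
For example, roughly like this (a sketch; the file name and connection details are placeholders):

// start the app with Knex query debugging and the inspector enabled:
//   DEBUG=knex:* node --inspect server.js
// or switch on Knex's debug logging from the config instead:
const knex = require('knex')({
  client: 'pg',
  connection: process.env.DATABASE_URL, // placeholder
  debug: true, // logs executed queries and their bindings
});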

The best way to figure out the slowness is to isolate the slow part of the app and inspect it carefully, or even post that isolated example to Stack Overflow so that other people can tell you why it might be slow. A general description like this doesn't give others much to work with when trying to resolve the real issue.

Objection.js handles eager loading by doing additional WHERE IN queries and not by doing one big query with a lot of JOINS. The reason for this is that it was easier to build that way and that it shouldn't harm performance too much.

With Objection you can select which eager algorithm you would like to use. In most cases (when there are one-to-many or many-to-many relations) making multiple queries is actually more performant than using a join, because with a join the amount of transferred data explodes, and transfer time + result parsing on the Node side will take too much time.
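
For example (a sketch; the relation names are placeholders):

// default: one additional query per relation (WhereInEagerAlgorithm)
await Product.query().eager('[shops, categories]');

// join-based loading instead: a single query with joins
await Product.query()
  .eager('[shops, categories]')
  .eagerAlgorithm(Product.JoinEagerAlgorithm);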

would it be speedier to build your own queries with Knex instead of generating them through Objection.js, in case you have a separately hosted database?

Usually no.

Speculation part:

  1. You mentioned that Most of my api endpoints do around 5-10 eager loads. Most of the time when I have encountered this kind of slowness, the reason has been that the app queries too big a chunk of data from the database. When a query returns, for example, tens of thousands of rows, that is several megabytes of JSON data. Just parsing that amount of data from the database into JavaScript objects takes hundreds of milliseconds. If your queries also cause high CPU load during that 300ms, this might be your problem.

  2. Slow queries. Sometimes the database doesn't have the right indexes set up, so queries have to scan whole tables linearly to get their results; checking the slow queries in the DB logs will help to find these (a sketch of a Knex migration that adds a missing index follows below). Also, if getting a response takes long but CPU load on the node process stays low, this might be the case.
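
If a missing index turns out to be the culprit, adding one through a Knex migration is straightforward. A rough sketch (the table and column names are hypothetical):

// migrations/20180810_add_barcode_product_id_index.js (hypothetical file)
exports.up = knex =>
  knex.schema.alterTable('barcode', table => {
    // speeds up the WHERE "product_id" IN (...) lookups done by eager loading
    table.index('product_id');
  });

exports.down = knex =>
  knex.schema.alterTable('barcode', table => {
    table.dropIndex('product_id');
  });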

Mikael Lepistö
  • 18,909
  • 3
  • 68
  • 70
  • Thanks for your extensive answer!! But I've already done a lot of query optimization, and even my full-text searches take less than 1 millisecond. Data fetching is paginated, so it's limited to less than 100 rows. So I think that rules out both of your speculation points ;) – Gersom Aug 10 '18 at 10:45
  • @gersom Well, then it seems that you will need to isolate that slow part and profile the code to find out what the real problem is. Btw, if you are fetching 100 rows and eager load 5-10 relations on top of those, it can easily add up to 50000 rows in total. Good luck finding the root cause :) – Mikael Lepistö Aug 10 '18 at 10:54
  • I'm trying to write a single pure Knex query now that achieves the equivalent, but with subqueries and `json_agg`. I've already added one many-to-one join and one many-to-many join, and API response time is down to 39ms :) – Gersom Aug 10 '18 at 10:59
2

Hereby I can confirm that every query takes approx. 30ms, regardless of the complexity of the query. Objection.js indeed does around 10 separate queries because of the eager loading, which explains the cumulative 300ms.
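
One simple way to see the raw round trip to the database is to time a trivial query (a sketch, not my exact code):

const start = Date.now();
await knex.raw('select 1'); // nothing to plan or scan, so this is mostly network latency
console.log(`round trip: ${Date.now() - start}ms`); // ~30ms in my case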


Just FYI; I'm going down this road now ⬇

I've started to delve into writing my own more advanced SQL queries. It seems like you can do pretty advanced stuff, achieving results similar to the eager loading of Objection.js:

select 
  "product".*,
  json_agg(distinct brand) as brand,
  case when count(shop) = 0 then '[]' else json_agg(distinct shop) end as shops,
  case when count(category) = 0 then '[]' else json_agg(distinct category) end as categories,
  case when count(barcode) = 0 then '[]' else json_agg(distinct barcode.code) end as barcodes
from "product"
inner join "brand" on "product"."brand_id" = "brand"."id"
left join "product_shop" on "product"."id" = "product_shop"."product_id"
left join "shop" on "product_shop"."shop_code" = "shop"."code"
left join "product_category" on "product"."id" = "product_category"."product_id"
left join "category" on "product_category"."category_id" = "category"."id"
left join "barcode" on "product"."id" = "barcode"."product_id"
group by "product"."id"

This takes 19ms on 1000 products, but usually the limit is 25 products, so very performant.
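
For reference, roughly the same query expressed through the Knex query builder might look like this (a sketch, not my exact code; the json_agg/case expressions still go through knex.raw because the builder has no helper for them):

const products = await knex('product')
  .select(
    'product.*',
    knex.raw('json_agg(distinct brand) as brand'),
    knex.raw("case when count(shop) = 0 then '[]' else json_agg(distinct shop) end as shops"),
    knex.raw("case when count(category) = 0 then '[]' else json_agg(distinct category) end as categories"),
    knex.raw("case when count(barcode) = 0 then '[]' else json_agg(distinct barcode.code) end as barcodes")
  )
  .innerJoin('brand', 'product.brand_id', 'brand.id')
  .leftJoin('product_shop', 'product.id', 'product_shop.product_id')
  .leftJoin('shop', 'product_shop.shop_code', 'shop.code')
  .leftJoin('product_category', 'product.id', 'product_category.product_id')
  .leftJoin('category', 'product_category.category_id', 'category.id')
  .leftJoin('barcode', 'product.id', 'barcode.product_id')
  .groupBy('product.id')
  .limit(25); // the usual page size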

Gersom
  • 651
  • 3
  • 19
0

As mentioned by other people, Objection uses multiple queries instead of joins by default to perform eager loading. It's a safer default than completely join-based loading, which can become really slow in some cases. You can read more about the default eager algorithm here.

You can choose to use the join-based algorithm simply by calling the joinEager method instead of eager. joinEager executes one single query.
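
For example (relation names are just placeholders):

// one single query with joins instead of one query per relation
await Product.query().joinEager('[shops, categories, barcodes]');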

Objection also used to have a (pretty stupid) default of 1 parallel query per operation, which meant that all queries inside an eager call were executed sequentially. That default has now been removed, and you should get better performance even in cases like yours.

Your trick of using json_agg is pretty clever and actually avoids the slowness problems I referred to that can arise in some cases when using joinEager. However, that cannot be easily used for nested loading or with other database engines.

Sami Koskimäki
  • 250
  • 2
  • 7
  • Thanks for the heads up! joinEager wasn't an option due to the broken pagination. The problem was indeed the sequential queries, so if most queries are now run in parallel, I guess I can factor out my custom raw query :) Do you know from which version on this changed? – Gersom Nov 19 '18 at 13:42