
In Postgres I have a table and a function that runs against it and returns the top result. The function takes around 2 ms to complete without cached data, which is the performance I need.

I then add PostgREST to the mix, since I need an HTTP API that allows services to run this function and use its results.

When I curl the PostgREST API, I get a response time of 0.29 seconds. That's 290 ms, over a hundred times slower.

What is making PostgREST so slow?

Setup

Postgres 12 and PostgREST 7.0.1 are running on the same machine and my requests are coming from the same machine, so there should be very little network latency.

Code

curl -d rating_min=5 -d rating_max=8 http://localhost:7045/rpc/get_json_data -w " %{time_starttransfer}\n"
> [{"json": {"name": "Friend"} }] 0.291
CREATE TABLE my_json_data (
    "json" jsonb NULL,
    mtime timestamp NULL,
    rating float8 NULL
);
CREATE OR REPLACE FUNCTION get_json_data(rating_min float, rating_max float)
RETURNS TABLE(
    "json" jsonb
) AS
$$
SELECT "json"
FROM my_json_data
WHERE rating BETWEEN rating_min AND rating_max
ORDER BY rating
LIMIT 1
$$ LANGUAGE SQL STABLE; -- STABLE: the result depends on table contents, so IMMUTABLE would be incorrect
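To confirm the ~2 ms baseline outside PostgREST, the function can be timed directly in psql. A minimal sketch, assuming the table lives in a database named mydb (adjust the connection details to your setup):

```shell
# Time the function directly in psql, bypassing PostgREST entirely.
# \timing on prints the client-measured execution time of each statement.
psql mydb <<'SQL'
\timing on
SELECT * FROM get_json_data(5, 8);
SQL
```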
Werner Raath
  • FYI - your function needs to return `jsonb` not `json`. As written, it will fail to be created, throwing `error: pq: 42P13: return type mismatch in function declared to return json` – Bill Jetzer Jun 30 '20 at 13:37
  • Thanks Bill, fixed that typo – Werner Raath Jun 30 '20 at 13:41
  • Hmm...[the docs](https://github.com/PostgREST/postgrest) only claim "_subsecond_ response times for up to 2000 requests/sec on Heroku free tier". – Bill Jetzer Jun 30 '20 at 13:48
  • Let me see; 1) Send request to server. 2) Process request 3) Query database and return results 4) Send response back. I think you need to break down where the latency is creeping in. – Adrian Klaver Jun 30 '20 at 13:59
  • @BillJetzer Unfortunately I cannot deploy this to Heroku since they don't have any regions in Africa. In terms of performance, 2000 requests per second would imply a response time of 0.5ms. So in theory what I am seeing shouldn't be happening. – Werner Raath Jul 01 '20 at 08:18
  • @AdrianKlaver Indeed I'm thinking that the request processing and response is the problem, potentially with the serialisation – Werner Raath Jul 01 '20 at 08:19
  • @Werner Which version of PostgREST are you using? There was a performance regression for function calls, which was fixed on the latest version (v7.0.1). – Steve Chavez Jul 01 '20 at 13:58
  • Hey @SteveChávez, I am indeed using 7.0.1 – Werner Raath Jul 02 '20 at 07:53
  • @Werner I see. One more thing, check the exact query PostgREST is generating by watching the [database logs](http://postgrest.org/en/v7.0.0/admin.html#database-logs). Extract the query from the logs and run it on your pg client (psql, pgAdmin). Check how the time compares to the original query. – Steve Chavez Jul 02 '20 at 15:11
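One way to follow that suggestion is to log every statement with its server-side duration, then pull PostgREST's generated query out of the Postgres log. A sketch, assuming superuser access and a database named mydb:

```shell
# Log every statement and its duration; the query PostgREST generates
# will then appear in the Postgres log with its server-side timing.
psql mydb <<'SQL'
ALTER SYSTEM SET log_min_duration_statement = 0;
SELECT pg_reload_conf();
SQL
```

Remember to raise the threshold again afterwards, since logging every statement is costly under load.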

2 Answers


You are comparing the query speed (in your Postgres client) to the query speed plus the database connection time. That is, when you make your curl request, in addition to running the query, PostgREST also needs to connect to the database, and that takes time. By default a pooled connection is closed after 10 s (look into db-pool-timeout). This is why your first call is slow and the second one is fast; it has nothing to do with the cached query plan. Make a curl request and it will be slow; make a second one immediately after and it will be fast; then wait 10–15 s and make a third one, and you will see it is slow again, since the database connection was closed. What you want is to raise db-pool-timeout to something like 30 minutes, and all your calls will be fast (except the first one).
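A minimal postgrest.conf fragment showing the change; the value is in seconds, so 1800 corresponds to the 30 minutes suggested above (adjust to taste):

```
# postgrest.conf
# Keep idle pooled connections open for 30 minutes instead of the
# 10 s default, so repeated requests reuse a warm connection rather
# than paying the reconnect cost each time.
db-pool-timeout = 1800
```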

Ruslan Talpa
  • But if pg + pgrst are on the same machine, the reconnection should take like 3-5ms. +280 ms is looking a bit much. – Steve Chavez Jul 02 '20 at 15:17
  • Changing the db-pool-timeout was indeed quite a big performance booster! Since I have thousands of queries being performed per second, this made a MASSIVE difference! – Werner Raath Jul 12 '20 at 13:49

This looks like a networking issue. You said yourself that the function runs efficiently from within the DB. It seems the network latency between the computer running curl and the computer serving the REST API is high. Or it could be related to network bandwidth, if the jsonb column is very big.

Jonathan Jacobson