
I have a database with a few hundred million rows. I'm running the following query:

SELECT *
FROM   "Payments" AS p
INNER JOIN "PaymentOrders" AS po ON po."Id" = p."PaymentOrderId"
INNER JOIN "Users"         AS u  ON u."Id" = po."UserId"
INNER JOIN "Roles"         AS r  ON u."RoleId" = r."Id"
WHERE  r."Name" = 'Moses'
LIMIT  1000

When the WHERE clause finds a match in the database, I get the result in several milliseconds, but if I modify the query and specify a non-existent r."Name" in the WHERE clause, it takes far too long to complete. I guess that PostgreSQL is doing a sequential scan on the Payments table (which contains the most rows), comparing each row one by one.

Isn't PostgreSQL smart enough to check first whether the Roles table contains any row with Name 'Moses'?

The Roles table contains only 15 rows, while Payments contains ~350 million.

I'm running PostgreSQL 9.2.1.

BTW, this same query on the same schema/data takes 0.024ms to complete on MS SQL Server.

I'll update the question and post EXPLAIN ANALYSE data in a few hours.
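
For reference, the plan linked below was captured by running the query above under EXPLAIN ANALYZE, along these lines:

-- Same query as above, wrapped in EXPLAIN ANALYZE to collect the actual plan
-- and row counts; the output is what explain.depesz.com renders.
EXPLAIN ANALYZE
SELECT * FROM "Payments" AS p
INNER JOIN "PaymentOrders" AS po ON po."Id" = p."PaymentOrderId"
INNER JOIN "Users" AS u ON u."Id" = po."UserId"
INNER JOIN "Roles" AS r ON u."RoleId" = r."Id"
WHERE r."Name" = 'Moses'
LIMIT 1000;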


Here are the EXPLAIN ANALYZE results: http://explain.depesz.com/s/7e7


And here's the server configuration:

version PostgreSQL 9.2.1, compiled by Visual C++ build 1600, 64-bit
client_encoding UNICODE
effective_cache_size    4500MB
fsync   on
lc_collate  English_United States.1252
lc_ctype    English_United States.1252
listen_addresses    *
log_destination stderr
log_line_prefix %t 
logging_collector   on
max_connections 100
max_stack_depth 2MB
port    5432
search_path dbo, "$user", public
server_encoding UTF8
shared_buffers  1500MB
TimeZone    Asia/Tbilisi
wal_buffers 16MB
work_mem    10MB

I'm running PostgreSQL on an i5 CPU (4 cores, 3.3 GHz), 8 GB of RAM, and a 128 GB Crucial m4 SSD.


UPDATE: This looks like a bug in the query planner. On Erwin Brandstetter's recommendation I reported it to the PostgreSQL bugs mailing list (see http://archives.postgresql.org/pgsql-bugs/2012-11/msg00089.php).

Davita
  • For now, please post the plain `EXPLAIN` to http://explain.depesz.com/ and link to it in your post. It'll run instantly. Thanks for including your PostgreSQL version and useful details. The other thing to try: `ANALYZE` the tables and see if performance improves. Maybe your stats are totally out? You can see this in `explain analyze` too, it's the most common cause of rowcount estimates being totally wrong. Please add a comment when you update, as SO doesn't send notifications on question edits. – Craig Ringer Nov 15 '12 at 22:46
  • @CraigRinger thank you very much for your help. I have analyzed all tables in the database and executed the plain EXPLAIN query only. Here's the result: http://explain.depesz.com/s/hol. However, EXPLAIN ANALYZE is still running and I'll post the results as soon as it finishes. Thanks again. – Davita Nov 16 '12 at 09:25
  • @CraigRinger sorry Craig, I don't understand what you mean. I'm not a native English speaker :). Anyway, here are the EXPLAIN ANALYZE results: http://explain.depesz.com/s/7e7 Any idea how to fix this slow query? Thank you very much again :) – Davita Nov 16 '12 at 10:32
  • That's fine. What I'm asking for is `EXPLAIN ANALYZE` (or at least `EXPLAIN`) for the query when it's fast. You've shown what the problem query plan is. It would help to see what PostgreSQL chooses when a name does exist, the query plan that produces a result quickly. – Craig Ringer Nov 16 '12 at 10:45
  • @CraigRinger Ah right. Here it is: http://explain.depesz.com/s/YY5Y. I think it's the same plan, but thanks to LIMIT, rows are hit faster this way and it doesn't have to go through the whole table..? I'm very new to PostgreSQL, sorry for my poor understanding :) – Davita Nov 16 '12 at 10:46
  • I *think* what is happening here is that PostgreSQL expects to only execute a small part of the nested loop before finding enough rows to satisfy your `LIMIT`. That's the case when the role is found, since enough rows to satisfy the limit are quickly found. If the role isn't found, though, it keeps on looking, trying to satisfy the join condition `(u."RoleId" = r."Id")` for enough rows. Pg is executing this query in a weird order. It seems very odd not to search the roles table *first*. This doesn't appear to be a statistics issue, as the row count estimates are all bang-on for the slow plan. – Craig Ringer Nov 16 '12 at 11:02
  • I would be interested in seeing if adding a `UNIQUE` constraint (and thus `UNIQUE` index) on `Roles.RoleId` had any effect, if that's compatible with your data model. Please also show your configuration from http://wiki.postgresql.org/wiki/Server_Configuration and have a read of http://wiki.postgresql.org/wiki/Slow_Query_Questions . I'm particularly interested in your `join_collapse_limit`, though it should default to high enough not to matter for this query. – Craig Ringer Nov 16 '12 at 11:03
  • @CraigRinger I updated my post. If I understand correctly, Roles.Id column is a primary key and it's already unique I think. Thanks for the link, I will take a look and see if I can find any solution. It's really weird :/ – Davita Nov 16 '12 at 11:17
  • I'd recommend posting on the pgsql-performance mailing list too, and linking back to this question. Include the query you showed, both explain analyzes with explanations of which is which, and psql `\d+` output for your tables, trimmed of unnecessary columns if you desire. – Craig Ringer Nov 16 '12 at 11:21
  • @CraigRinger thanks Craig for your time. I will try that :) – Davita Nov 16 '12 at 11:25
  • What happens if you remove the `LIMIT` clause for the case where no role is found? I have a suspicion that it will result in the fast plan - making `LIMIT` the culprit here. – Erwin Brandstetter Nov 16 '12 at 12:29
  • Also: please add the table definitions (including indexes). I agree that LIMIT stinks (but LIMIT without ORDER BY should be OK, BTW), but I think there is a cartesian smell here as well. IOW: simplify, use a `count(*)` or something. – wildplasser Nov 17 '12 at 00:24

2 Answers


As suggested a couple of times in the thread on the PostgreSQL community performance list, you can work around this issue by forcing an optimization barrier using a CTE, like this:

WITH x AS
(
  SELECT *
  FROM   "Payments" AS p
  JOIN   "PaymentOrders" AS po ON po."Id" = p."PaymentOrderId"
  JOIN   "Users"         AS u  ON u."Id" = po."UserId"
  JOIN   "Roles"         AS r  ON u."RoleId" = r."Id"
  WHERE  r."Name" = 'Moses'
)
SELECT * FROM x
LIMIT 1000;

You may also get a good plan for your original query if you set a higher statistics target for "Roles"."Name" and then ANALYZE. For example:

ALTER TABLE "Roles"
  ALTER COLUMN "Name" SET STATISTICS 1000;
ANALYZE "Roles";

If it expects fewer matching rows to exist in the table, as it is likely to do with more fine-grained statistics, it will assume that it needs to read a higher percentage of the table to find them on a sequential scan. This may cause it to prefer using the index instead of sequentially scanning the table.
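
If you want to see what the planner now knows about that column after the ANALYZE, you can peek at the gathered statistics. A minimal sketch using the standard pg_stats view (the schema name here is only a guess based on the search_path shown above; adjust or drop it as needed):

-- Inspect the per-column statistics gathered by ANALYZE for "Roles"."Name":
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM   pg_stats
WHERE  schemaname = 'dbo'      -- assumption: tables live in the dbo schema
AND    tablename  = 'Roles'
AND    attname    = 'Name';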

You might also get a better plan for the original query by adjusting some of the planner's costing constants and caching assumptions. Here are some things you could try in a single session with the SET command (a sketch follows the list):

  • Reduce random_page_cost. This is largely based on how heavily cached your data is. Given a table with hundreds of millions of rows you probably don't want to go below 2; although if the active data set in your database is heavily cached you can reduce it all the way down to the setting for seq_page_cost, and you may want to reduce both of them by an order of magnitude.

  • Make sure that effective_cache_size is set to the sum of shared_buffers and whatever your OS is caching. This doesn't allocate any memory; it just tells the optimizer how likely index pages are to remain in cache during heavy access. A higher setting makes indexes look better when compared to sequential scans.

  • Increase cpu_tuple_cost to somewhere in the range of 0.03 to 0.05. I have found the default of 0.01 to be too low. I often get better plans by increasing it, and have never seen a value in the range I suggested cause worse plans to be chosen.

  • Make sure that your work_mem setting is reasonable. In most environments where I've run PostgreSQL, that is in the 16MB to 64MB range. This allows better use of hash tables, bitmap index scans, sorts, etc., and can completely change your plans, almost always for the better. Be careful about setting this too high if you have a large number of connections -- you should allow for the fact that each connection can allocate this much memory for each node of the query it is running. The "rule of thumb" is to figure you will hit peaks around this setting times max_connections. This is one of the reasons it is wise to limit your actual number of database connections using a connection pool.
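
A session-level sketch of the knobs above; the specific numbers are only illustrative starting points for this machine (8 GB RAM, SSD) and should be validated by re-running EXPLAIN ANALYZE on the problem query:

-- Try in one session only; nothing here touches postgresql.conf.
SET random_page_cost     = 2.0;    -- lower (toward seq_page_cost) if the active set is well cached
SET effective_cache_size = '6GB';  -- roughly shared_buffers + OS file cache on an 8 GB box
SET cpu_tuple_cost       = 0.03;   -- within the 0.03 to 0.05 range suggested above
SET work_mem             = '32MB'; -- keep max_connections * work_mem in mind

-- Then re-run EXPLAIN ANALYZE on the problem query to compare plans.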

If you find a good combination of settings for these, you might want to make those changes to your postgresql.conf file. If you do that, monitor closely for performance regressions, and be prepared to tweak the settings for the best performance of your overall load.

I agree that we need to do something to nudge the optimizer away from "risky" plans, even if they look like they will run faster on average; but I will be a little surprised if tuning your configuration so that the optimizer better models the actual costs of each alternative doesn't cause it to use an efficient plan.

kgrittn

Finally successful attempt

My other idea - as per comment:
What happens if you remove the LIMIT clause for the case where no role is found? I have a suspicion that it will result in the fast plan - making LIMIT the culprit here.

You may be able to solve your problem by pushing down your query into a subquery and applying the LIMIT only to the outer query (untested):

SELECT *
FROM  (
   SELECT *
   FROM   "Roles"         AS r  
   JOIN   "Users"         AS u  ON u."RoleId" = r."Id"
   JOIN   "PaymentOrders" AS po ON po."UserId" = u."Id"
   JOIN   "Payments"      AS p  ON p."PaymentOrderId" = po."Id"
   WHERE  r."Name" = 'Moses'
  ) x
LIMIT  1000;

As per the comments: @Davita tested this workaround and ruled it out. @Kevin's answer later clarified why the workaround failed: use a CTE instead of the subquery.
Alternatively, check for the existence of the role before you run the big query, to eliminate the bad case.
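
A minimal sketch of that existence pre-check, assuming the application issues it as a separate statement before the expensive join:

-- Cheap pre-check against the tiny "Roles" table (15 rows):
SELECT EXISTS (
   SELECT 1
   FROM   "Roles"
   WHERE  "Name" = 'Moses'
) AS role_exists;

-- Only if role_exists is true does the application run the big
-- four-table join with LIMIT 1000; otherwise it can return an
-- empty result immediately.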

This leaves questions for PostgreSQL concerning the optimization of queries with LIMIT.

There have been a number of recent bug reports concerning query plans with LIMIT. I quote Simon Riggs commenting on one of these reports here:

Very bad plans with LIMIT are frequent. This is bad for us because adding LIMIT usually/is supposed to make queries faster, not slower.

We need to do something.

First attempt with no success

I missed that @Craig already mentioned join_collapse_limit in the comments. So that was of limited use:

Does reordering the JOIN clauses have any effect?

SELECT *
FROM   "Roles"         AS r  
JOIN   "Users"         AS u  ON u."RoleId" = r."Id"
JOIN   "PaymentOrders" AS po ON po."UserId" = u."Id"
JOIN   "Payments"      AS p  ON p."PaymentOrderId" = po."Id"
WHERE  r."Name" = 'Moses'
LIMIT  1000

Related: did you by any chance mess with the settings of join_collapse_limit or geqo_threshold? A very low setting might prevent the planner from reordering your JOIN clauses, which could explain your problem.
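
A quick way to check (and, if needed, restore) those two settings for the current session; 8 and 12 are the stock defaults:

-- Verify the planner is allowed to reorder a 4-table join:
SHOW join_collapse_limit;   -- default 8
SHOW geqo_threshold;        -- default 12

-- Restore the defaults for this session if they were lowered:
SET join_collapse_limit = DEFAULT;
SET geqo_threshold      = DEFAULT;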

If that does not solve the case, I would try to create an index on "Roles"(Name). Not that this makes any sense with only 15 rows, but I would try to eliminate the suspicion that invalid statistics or cost parameters (or even a bug) make the planner believe the sequential scan on "Roles" to be more expensive than it is.
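
For completeness, that experiment would look something like this (the index name is an arbitrary choice):

-- Probably pointless for 15 rows, but it rules out one more suspect:
CREATE INDEX roles_name_idx ON "Roles" ("Name");
ANALYZE "Roles";
-- ...then re-run EXPLAIN ANALYZE on the original query.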

Erwin Brandstetter
  • If changing the join order changed something I would consider this a bug in PostgreSQL. –  Nov 16 '12 at 11:50
  • Thanks Erwin, I tried your suggestions: I changed join_collapse_limit to 16 and then 64, but I got the same result. I tried your query and nothing changed. I added a B-tree index on Roles.Name but no change :| I'm really confused here. – Davita Nov 16 '12 at 12:02
  • @a_horse_with_no_name: With a small number of tables the planner reorders JOINs to fit its best estimate. But the optimal order of JOINs is an O(n!) problem, so there is a limit after which more generic attempts at optimization are used. Of course, this shouldn't be a problem at all for just 4 tables. Query execution obviously starts at the wrong end here, therefore my shot in the dark at `join_collapse_limit` - and the now added `geqo_threshold`. – Erwin Brandstetter Nov 16 '12 at 12:02
  • @Davita: Sorry, at least it rules out some possible (theoretical) causes. Did you look at `geqo_threshold`, too? (Added later.) – Erwin Brandstetter Nov 16 '12 at 12:04
  • @ErwinBrandstetter thanks again Erwin, I just tried setting geqo_threshold (though I've no idea what it is :D) to 40, but still the same query plan. – Davita Nov 16 '12 at 12:08
  • Next idea: `VACUUM FULL ANALYZE "Role"` If that doesn't do anything, I'd try `VACUUM FULL ANALYZE` on the database - will be expensive with millions of rows! So maybe at off hours if you have concurrent load. But these are all just shots in the dark .. – Erwin Brandstetter Nov 16 '12 at 12:16
  • @ErwinBrandstetter thanks Erwin, I already tried VACUUM FULL ANALYZE and it didn't help. I also re-analyzed Role table after adding index. – Davita Nov 16 '12 at 12:22
  • Per [documentation](http://www.postgresql.org/docs/current/interactive/explicit-joins.html), explicit join syntax forces PostgreSQL to join tables in the specified order. So if the original query and the one suggested by Erwin produce different plans, something is wrong here. – vyegorov Nov 16 '12 at 12:32
  • @vyegorov: Explicit join syntax *only* forces the order when `join_collapse_limit` is lower than the number of tables. You meant that, right? – Erwin Brandstetter Nov 16 '12 at 12:40
  • @ErwinBrandstetter you were right, LIMIT was the cause of an incorrect query plan. I removed LIMIT and everything runs perfectly :) Thank you very much. Could you please update your post so I can accept it and others will find it useful? Thanks again :) – Davita Nov 16 '12 at 12:40
  • Wonderful, thank you very much for the definitive answer. Just one note: pushing the query into a subquery doesn't solve the issue; it looks like this is a big problem in PostgreSQL. Still, checking the existence first is definitely a fix. Maybe someone can give me a hint where I can report this issue (if it is considered an issue at all)? Thank you very much Erwin. – Davita Nov 16 '12 at 12:54
  • @Davita: You have been very cooperative, you are very welcome. You might report this to `pgsql-bugs@postgresql.org` Or use the form provided [here](http://www.postgresql.org/support/submitbug/). More options: http://www.postgresql.org/community/lists/ I suggest you link to this page for context. I remember having similar problems in the past, may be a long-standing issue .. – Erwin Brandstetter Nov 16 '12 at 13:02
  • @ErwinBrandstetter Thanks again, I reported the issue :) – Davita Nov 16 '12 at 13:34
  • @Davita: maybe add a link to the post to the question - once you have it. – Erwin Brandstetter Nov 16 '12 at 13:38
  • @ErwinBrandstetter yes, I sent them the link to this thread and gave them lots of explanation :) – Davita Nov 16 '12 at 13:42
  • @Davita: I mean the other way round: add this link http://archives.postgresql.org/pgsql-bugs/2012-11/msg00089.php to your question. :) – Erwin Brandstetter Nov 16 '12 at 13:44
  • @ErwinBrandstetter sorry, my bad :)) – Davita Nov 16 '12 at 13:51
  • @Davita: I added a link to similar bug reports to my answer. – Erwin Brandstetter Nov 16 '12 at 13:52
  • @ErwinBrandstetter Ouch, there were already many reports, hope one more won't hurt :) – Davita Nov 16 '12 at 13:56
  • @Davita: As I wrote: `may be a long-standing issue`... A reminder won't hurt. Your case is particularly clear-cut. – Erwin Brandstetter Nov 16 '12 at 13:59