I'm building an application which uses two different SQL queries. I'm running EXPLAIN ANALYZE
on these two queries on the same Postgres table to understand their performance.
Query 1: Xmin query
explain analyze
select * from table where xmin::text::bigint >= xmin_max_value;
Output:
Seq Scan on users (cost=0.00..91302235.56 rows=666658645 width=141) (actual time=1686004.249..1686004.250 rows=0 loops=1)
Filter: (((xmin)::text)::bigint >= xmin_max_value)
Rows Removed by Filter: 2000000000
Planning Time: 3.066 ms
Execution Time: 1686004.308 ms
Query 2: Unindexed field query
explain analyze
select * from "2b_users".users where age > max_age;
Result:
Gather (cost=1000.00..56720318.43 rows=1 width=141) (actual time=545081.498..545083.143 rows=0 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on users (cost=0.00..56719318.33 rows=1 width=141) (actual time=545071.505..545071.505 rows=0 loops=3)
Filter: (age > 100)
Rows Removed by Filter: 666666667
Planning Time: 0.153 ms
Execution Time: 545083.178 ms`
I have set these queries up such that each query is expected to return no data. This is because the cursor value I've chosen in each case represents the max cursor value associated with that field.
I have a couple of questions about this behavior:
- How does the Postgres query planner determine whether to do a parallel sequential scan or not? I would have expected both of these queries to have the same plan, as they are both unindexed fields.
- I understand that
xmin
is a system column. Is there something about system-level columns that prevent the query planner from performing a parallel scan? Nothing in the documentation can explain why this should be a sequential scan : https://www.postgresql.org/docs/current/parallel-plans.html - Is there a way/setting to turn on parallel scans for the
xmin
query? Parallel scans are much faster, so ideally I'd like this query to also leverage parallel scans.