I have a problem with PostgreSQL 15.2 not using a partition-wise join.
I use two tables range-partitioned by the same key, date_id. The partition ranges of both tables are identical, except that one of them (date_dim) has many more partitions.
Table DDL:
CREATE TABLE date_dim
(
    date_id date NOT NULL,
    month_key integer NOT NULL,
    the_month smallint NOT NULL,
    quarter_key smallint,
    the_quarter smallint NOT NULL,
    the_year smallint NOT NULL,
    day_in_month smallint NOT NULL,
    day_in_year smallint NOT NULL,
    week_in_year smallint NOT NULL,
    day_pl text COLLATE pg_catalog."default" NOT NULL,
    day_en text COLLATE pg_catalog."default" NOT NULL,
    month_pl text COLLATE pg_catalog."default" NOT NULL,
    month_en text COLLATE pg_catalog."default" NOT NULL,
    week_number smallint NOT NULL DEFAULT 0,
    week_of_the_month smallint,
    is_holiday smallint,
    CONSTRAINT date_dim_pk PRIMARY KEY (date_id)
) PARTITION BY RANGE (date_id);
CREATE INDEX date_dim_date_id_idx ON date_dim USING btree (date_id ASC NULLS LAST);
CREATE INDEX date_dim_month_key_idx ON date_dim USING btree (month_key ASC NULLS LAST);
CREATE TABLE status_change_fact
(
    status_change_dim_id bigint,
    status_change_id integer,
    date_id date,
    project_dim_id bigint,
    customer_dim_id bigint,
    is_talk smallint,
    downloaded_unique text COLLATE pg_catalog."default",
    contacted_unique text COLLATE pg_catalog."default",
    successes_unique text COLLATE pg_catalog."default",
    closed_unique text COLLATE pg_catalog."default",
    downloaded_nonunique smallint,
    contacted_nonunique smallint,
    successes_nonunique smallint,
    closed_nonunique smallint,
    dw_source_id integer,
    count smallint DEFAULT 1,
    call_junk_dim_id bigint,
    closed_customer smallint
) PARTITION BY RANGE (date_id);
CREATE INDEX status_change_fact_date_id_project_dim_id_idx ON status_change_fact USING btree (date_id ASC NULLS LAST, project_dim_id ASC NULLS LAST);
CREATE INDEX status_change_fact_status_change_dim_id_date_id_idx ON status_change_fact USING btree (status_change_dim_id ASC NULLS LAST, date_id ASC NULLS LAST);
Most important partitions:
CREATE TABLE date_dim_202302 PARTITION OF date_dim FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
CREATE TABLE status_change_fact_202302 PARTITION OF status_change_fact FOR VALUES FROM ('2023-02-01') TO ('2023-03-01');
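As a side note, the claim above that the partition ranges of both tables line up can be checked with a catalog query along these lines (just a sanity check, not part of the original setup):
SELECT i.inhparent::regclass AS parent,
       c.relname AS partition,
       pg_get_expr(c.relpartbound, c.oid) AS bounds
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent IN ('date_dim'::regclass, 'status_change_fact'::regclass)
ORDER BY 1, 2;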
A simple query that joins both tables on the partition key, with a filter on date_id that uses the current_date function:
explain analyze
select *
FROM date_dim dd
left JOIN status_change_fact sf
on dd.date_id = sf.date_id
where dd.date_id BETWEEN
((current_date + (interval '1' day * floor(-3))) - ( EXTRACT( DAY FROM (current_date + (interval '1' day * floor(-3))) ) * INTERVAL '1 DAY' ) + INTERVAL '1 DAY') AND
((current_date + (interval '1' day * floor(-3))) - ( EXTRACT( DAY FROM (current_date + (interval '1' day * floor(-3))) ) * INTERVAL '1 DAY' ) + INTERVAL '1 DAY' + INTERVAL '1 MONTH' - INTERVAL '1 DAY');
The WHERE filter means "between the beginning and the end of the month that was 3 days ago", i.e. the whole of February 2023 on the day the query was executed.
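For illustration, this is how the two bounds evaluate if an assumed execution date of 2023-03-03 is substituted for current_date (a standalone check, not part of the real query):
SELECT
    ((d + (interval '1' day * floor(-3))) - ( EXTRACT( DAY FROM (d + (interval '1' day * floor(-3))) ) * INTERVAL '1 DAY' ) + INTERVAL '1 DAY') AS month_start,
    ((d + (interval '1' day * floor(-3))) - ( EXTRACT( DAY FROM (d + (interval '1' day * floor(-3))) ) * INTERVAL '1 DAY' ) + INTERVAL '1 DAY' + INTERVAL '1 MONTH' - INTERVAL '1 DAY') AS month_end
FROM (VALUES ('2023-03-03'::date)) AS v(d);
-- month_start = 2023-02-01 00:00:00, month_end = 2023-02-28 00:00:00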
Explain results:
Merge Left Join (cost=73.95..2419690.04 rows=93583886 width=283) (actual time=10767.681..12547.594 rows=3226419 loops=1)
Merge Cond: (dd.date_id = sf.date_id)
-> Append (cost=68.93..1238.08 rows=698 width=156) (actual time=0.029..0.052 rows=28 loops=1)
Subplans Removed: 348
-> Index Scan using date_dim_202302_date_id_idx on date_dim_202302 dd_1 (cost=0.20..3.54 rows=2 width=156) (actual time=0.022..0.040 rows=28 loops=1)
Index Cond: ((date_id >= (((CURRENT_DATE + '-3 days'::interval) - ((EXTRACT(day FROM (CURRENT_DATE + '-3 days'::interval)))::double precision * '1 day'::interval)) + '1 day'::interval)) AND (date_id <= (((((CURRENT_DATE + '-3 days'::interval) - ((EXTRACT(day FROM (CURRENT_DATE + '-3 days'::interval)))::double precision * '1 day'::interval)) + '1 day'::interval) + '1 mon'::interval) - '1 day'::interval)))
-> Materialize (cost=5.02..1081729.10 rows=26814867 width=127) (actual time=0.012..10158.205 rows=26814487 loops=1)
-> Append (cost=5.02..1014691.93 rows=26814867 width=127) (actual time=0.010..7349.431 rows=26814487 loops=1)
-> Index Scan using status_change_fact_202201_date_id_project_dim_id_idx on status_change_fact_202201 sf_1 (cost=0.29..1950.21 rows=56988 width=130) (actual time=0.009..8.610 rows=56988 loops=1)
-> Index Scan using status_change_fact_202202_date_id_project_dim_id_idx on status_change_fact_202202 sf_2 (cost=0.28..301.32 rows=8117 width=130) (actual time=0.009..1.732 rows=8117 loops=1)
-> Index Scan using status_change_fact_202203_date_id_project_dim_id_idx on status_change_fact_202203 sf_3 (cost=0.28..223.19 rows=6143 width=130) (actual time=0.012..1.479 rows=6143 loops=1)
-> Index Scan using status_change_fact_202204_date_id_project_dim_id_idx on status_change_fact_202204 sf_4 (cost=0.29..658.95 rows=19063 width=130) (actual time=0.008..3.723 rows=19063 loops=1)
-> Index Scan using status_change_fact_202205_date_id_project_dim_id_idx on status_change_fact_202205 sf_5 (cost=0.29..356.18 rows=9941 width=130) (actual time=0.010..1.219 rows=9941 loops=1)
-> Index Scan using status_change_fact_202206_date_id_project_dim_id_idx on status_change_fact_202206 sf_6 (cost=0.29..1036.62 rows=30427 width=130) (actual time=0.006..3.503 rows=30427 loops=1)
-> Index Scan using status_change_fact_202207_date_id_project_dim_id_idx on status_change_fact_202207 sf_7 (cost=0.29..1034.62 rows=30809 width=130) (actual time=0.009..3.304 rows=30809 loops=1)
-> Index Scan using status_change_fact_202208_date_id_project_dim_id_idx on status_change_fact_202208 sf_8 (cost=0.28..193.90 rows=5356 width=130) (actual time=0.011..0.665 rows=5356 loops=1)
-> Index Scan using status_change_fact_202209_date_id_project_dim_id_idx on status_change_fact_202209 sf_9 (cost=0.43..148748.78 rows=4520518 width=130) (actual time=0.012..991.617 rows=4520518 loops=1)
-> Index Scan using status_change_fact_202210_date_id_project_dim_id_idx on status_change_fact_202210 sf_10 (cost=0.43..135721.97 rows=4112554 width=130) (actual time=0.015..1171.379 rows=4112554 loops=1)
-> Index Scan using status_change_fact_202211_date_id_project_dim_id_idx on status_change_fact_202211 sf_11 (cost=0.43..213329.82 rows=6588362 width=120) (actual time=0.012..1326.248 rows=6588362 loops=1)
-> Index Scan using status_change_fact_202212_date_id_project_dim_id_idx on status_change_fact_202212 sf_12 (cost=0.43..143825.93 rows=4350148 width=128) (actual time=0.010..1113.317 rows=4350148 loops=1)
-> Index Scan using status_change_fact_202301_date_id_project_dim_id_idx on status_change_fact_202301 sf_13 (cost=0.43..127917.02 rows=3849646 width=130) (actual time=0.011..885.968 rows=3849646 loops=1)
-> Index Scan using status_change_fact_202302_date_id_project_dim_id_idx on status_change_fact_202302 sf_14 (cost=0.43..105301.15 rows=3226415 width=130) (actual time=0.010..327.165 rows=3226415 loops=1)
-> Index Scan using status_change_fact_202303_date_id_project_dim_id_idx on status_change_fact_202303 sf_15 (cost=0.15..17.95 rows=380 width=186) (actual time=0.006..0.006 rows=0 loops=1)
Planning Time: 36.252 ms
Execution Time: 12674.066 ms
As you can see, PG uses an Index Scan with no filter to read every partition of status_change_fact, while it applies partition pruning to date_dim and reads only the single partition for February 2023.
Moreover, PG does not even use an index condition on date_id for status_change_fact, which would have limited the scans to just one of its partitions.
When the current_date function is replaced by a literal date:
explain analyze
select *
FROM date_dim dd
left JOIN status_change_fact sf
on dd.date_id = sf.date_id
where dd.date_id BETWEEN
(('2023-03-01'::date + (interval '1' day * floor(-3))) - ( EXTRACT( DAY FROM ('2023-03-01'::date + (interval '1' day * floor(-3))) ) * INTERVAL '1 DAY' ) + INTERVAL '1 DAY') AND
(('2023-03-01'::date + (interval '1' day * floor(-3))) - ( EXTRACT( DAY FROM ('2023-03-01'::date + (interval '1' day * floor(-3))) ) * INTERVAL '1 DAY' ) + INTERVAL '1 DAY' + INTERVAL '1 MONTH' - INTERVAL '1 DAY');
both tables are correctly pruned and joined partition-wise, and the one remaining partition of status_change_fact is even scanned with an index condition on date_id:
Nested Loop Left Join (cost=0.58..38401.46 rows=14666 width=283) (actual time=0.043..1070.607 rows=3226419 loops=1)
-> Index Scan using date_dim_202302_date_id_idx on date_dim_202302 dd (cost=0.15..3.49 rows=2 width=156) (actual time=0.021..0.034 rows=28 loops=1)
Index Cond: ((date_id >= '2023-02-01 00:00:00'::timestamp without time zone) AND (date_id <= '2023-02-28 00:00:00'::timestamp without time zone))
-> Index Scan using status_change_fact_202302_date_id_project_dim_id_idx on status_change_fact_202302 sf (cost=0.43..17854.65 rows=134434 width=130) (actual time=0.017..12.107 rows=115229 loops=28)
Index Cond: (date_id = dd.date_id)
Planning Time: 0.695 ms
Execution Time: 1162.486 ms
The difference in execution time is also clearly visible (12 s vs. 1 s).
I know that the current_date function depends on the session time zone, but according to the documentation (https://www.postgresql.org/docs/current/xfunc-volatility.html) it is marked as stable, so its result is constant within a single statement and can be used for partition pruning, and that evidently happens for date_dim at least. Since PG knows that only one partition of date_dim is read and that both tables are partitioned by the same ranges, it could use this fact to prune status_change_fact as well.
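A minimal single-table sketch of that difference against the same date_dim table (the comments describe the behaviour I would expect based on the plans above, with assumed example dates):
-- Stable expression: pruning is deferred to executor startup and shows up
-- as "Subplans Removed" in the EXPLAIN ANALYZE output, as in the plan above.
explain analyze select * from date_dim where date_id = current_date - 3;
-- Constant: the planner prunes at plan time, so only the matching partition
-- appears in the plan at all.
explain analyze select * from date_dim where date_id = '2023-02-26'::date;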
My question is: why does PostgreSQL prune only date_dim when current_date is used, yet prunes and partition-wise joins both tables when a literal is used, and how can we convince PostgreSQL to do the same with current_date? The query is generated by a BI tool and we cannot change it directly, so current_date has to stay.