Given the following tables
buckets
metric_id|start_date |bucket
------------------------------------
a |2019-12-05 00:00:00|1
a |2019-12-06 00:00:00|2
b |2021-10-31 00:00:00|1
b |2021-11-01 00:00:00|2
points
point_id|metric_id|timestamp
----------------------------
1 |a |2019-12-05 00:00:00
2 |a |2019-12-06 00:00:00
3 |b |2021-10-31 00:00:00
4 |b |2021-11-01 00:00:00
And the following query
select
p.metric_id,
bucket
from points p
left join width_bucket(p.timestamp, (select array(select start_date
from buckets b
where b.metric_id = p.metric_id -- correlated sub-query
))) as bucket on true
Output
metric_id|bucket
-----------------
a |1
a |2
b |1
b |2
How can I remove the correlated sub-query to improve the performance?
Currently ~280,000 points * ~650 buckets = ~180,000,000 loops = very slow!
Basically I want to remove the correlated sub-query and apply the width_bucket function only once per unique metric_id in buckets, so that the performance is improved and the function is still given the correct time series data.
How can this be done in Postgres 13?