Below is the version of PostgreSQL used:
SELECT version();
version
-----------------------------------------------------------------------------------------------------------
PostgreSQL 9.6.22 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)
I am checking the performance of two queries given below, please note that the table "import_request" is partitioned by "enterpriseId" field.
- Using the base table "import_request" - lets call this "query1"
explain analyze
With distinct_import_req_details AS (
select distinct on (import_request_id) import_request.enterprise_id AS enterpriseId,
infrastructure_id AS infrastructureId,
import_request.error_code AS errorCode,
short_description AS shortDescription,
import_request_id,
DATE_PART('day', end_date - start_date) AS nu_ext_days
from import_request
JOIN infra_instan USING (infra_instan_id)
JOIN sub_infra USING (infra_instan_id)
JOIN project_sub USING (project_sub_id)
JOIN project USING (project_id)
JOIN organization USING (organization_id)
LEFT JOIN error_code ON import_request.error_code = tx_code
where import_request.enterprise_id = 133
and extraction_status / 1000 IN (8, 9)
and marked_for_deletion = false
and import_request.error_code is not null
and import_request.error_code != coalesce(infra_instan.error_code, '')
and retry_status = 1)
select enterpriseId,
infrastructureId,
errorCode,
shortDescription,
count(import_request_id) AS failedCount,
sum(nu_ext_days) AS total_ext_days
from distinct_import_req_details
group by enterpriseId, infrastructureId, errorCode, shortDescription
order by enterpriseId, infrastructureId, total_ext_days desc;
- Using the child table (partitioned) "import_request_133" - lets call this "query2"
explain analyze
With distinct_import_req_details AS (
select distinct on (import_request_id) import_table.enterprise_id AS enterpriseId,
infrastructure_id AS infrastructureId,
import_table.error_code AS errorCode,
short_description AS shortDescription,
import_request_id,
DATE_PART('day', end_date - start_date) AS nu_ext_days
from import_request_133 import_table
JOIN infra_instan USING (infra_instan_id)
JOIN sub_infra USING (infra_instan_id)
JOIN project_sub USING (project_sub_id)
JOIN project USING (project_id)
JOIN organization USING (organization_id)
LEFT JOIN error_code ON import_table.error_code = tx_code
where import_table.enterprise_id = 133
and extraction_status / 1000 IN (8, 9)
and marked_for_deletion = false
and import_table.error_code is not null
and import_table.error_code != coalesce(infra_instan.error_code, '')
and retry_status = 1)
select enterpriseId,
infrastructureId,
errorCode,
shortDescription,
count(import_request_id) AS failedCount,
sum(nu_ext_days) AS total_ext_days
from distinct_import_req_details
group by enterpriseId, infrastructureId, errorCode, shortDescription
order by enterpriseId, infrastructureId, total_ext_days desc;
When I am using the query1, the execution time is around 5.5 minutes and when query2 is used the the execution time is around 35 seconds, I checked the query planner for these and noticed that for the query1 and query2 the plan is different, I was thinking that once the child table is found the query plan should be same (or at-least similar) but this is not the case here. Because of this the execution time of query1 is too much compared to query2.
Plan for query1 : https://explain.depesz.com/s/0jtE
Plan for query2 : https://explain.depesz.com/s/TxWh
Has anyone noticed this before, if yes, can you please let me know why is the query plan changing for these queries and is there any fix as this could impact the execution time of other queries we write using the base table ?
Thanks for the support in advance.