0

Below is the version of PostgreSQL used:

SELECT version();
                                                  version
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.22 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)

I am checking the performance of two queries given below, please note that the table "import_request" is partitioned by "enterpriseId" field.

  1. Using the base table "import_request" - lets call this "query1"
explain analyze
With distinct_import_req_details AS (
    select distinct on (import_request_id) import_request.enterprise_id               AS enterpriseId,
                                           infrastructure_id                             AS infrastructureId,
                                           import_request.error_code               AS errorCode,
                                           short_description                          AS shortDescription,
                                           import_request_id,
                                           DATE_PART('day', end_date - start_date) AS nu_ext_days
    from import_request
             JOIN infra_instan USING (infra_instan_id)
             JOIN sub_infra USING (infra_instan_id)
             JOIN project_sub USING (project_sub_id)
             JOIN project USING (project_id)
             JOIN organization USING (organization_id)
             LEFT JOIN error_code ON import_request.error_code = tx_code
    where import_request.enterprise_id = 133
      and extraction_status / 1000 IN (8, 9)
      and marked_for_deletion = false
      and import_request.error_code is not null
      and import_request.error_code != coalesce(infra_instan.error_code, '')
      and retry_status = 1)
select enterpriseId,
       infrastructureId,
       errorCode,
       shortDescription,
       count(import_request_id) AS failedCount,
       sum(nu_ext_days)         AS total_ext_days
from distinct_import_req_details
group by enterpriseId, infrastructureId, errorCode, shortDescription
order by enterpriseId, infrastructureId, total_ext_days desc;


  1. Using the child table (partitioned) "import_request_133" - lets call this "query2"
explain analyze
With distinct_import_req_details AS (
    select distinct on (import_request_id) import_table.enterprise_id              AS enterpriseId,
                                           infrastructure_id                       AS infrastructureId,
                                           import_table.error_code                 AS errorCode,
                                           short_description                       AS shortDescription,
                                           import_request_id,
                                           DATE_PART('day', end_date - start_date) AS nu_ext_days
    from import_request_133 import_table
             JOIN infra_instan USING (infra_instan_id)
             JOIN sub_infra USING (infra_instan_id)
             JOIN project_sub USING (project_sub_id)
             JOIN project USING (project_id)
             JOIN organization USING (organization_id)
             LEFT JOIN error_code ON import_table.error_code = tx_code
    where import_table.enterprise_id = 133
      and extraction_status / 1000 IN (8, 9)
      and marked_for_deletion = false
      and import_table.error_code is not null
      and import_table.error_code != coalesce(infra_instan.error_code, '')
      and retry_status = 1)
select enterpriseId,
       infrastructureId,
       errorCode,
       shortDescription,
       count(import_request_id) AS failedCount,
       sum(nu_ext_days)         AS total_ext_days
from distinct_import_req_details
group by enterpriseId, infrastructureId, errorCode, shortDescription
order by enterpriseId, infrastructureId, total_ext_days desc;

When I am using the query1, the execution time is around 5.5 minutes and when query2 is used the the execution time is around 35 seconds, I checked the query planner for these and noticed that for the query1 and query2 the plan is different, I was thinking that once the child table is found the query plan should be same (or at-least similar) but this is not the case here. Because of this the execution time of query1 is too much compared to query2.

Plan for query1 : https://explain.depesz.com/s/0jtE

Plan for query2 : https://explain.depesz.com/s/TxWh

Has anyone noticed this before, if yes, can you please let me know why is the query plan changing for these queries and is there any fix as this could impact the execution time of other queries we write using the base table ?

Thanks for the support in advance.

  • Sorry, but you should upgrade and use declarative partitioning. – Laurenz Albe Sep 14 '21 at 21:39
  • Analyze all the tables involved. Why aren't they already analyzed? Are any of these temp tables? – jjanes Sep 14 '21 at 22:44
  • @jjanes All the tables involved are already analysed and none of them are temp tables. – Siddalingaprasad R Sep 16 '21 at 15:13
  • It is hard to believe that "error_code" has been analyzed recently. That estimate is off by 88 fold, and there is nothing complicated about that estimate. I can see no reason for it to be off by that much, other than lousy stats. – jjanes Sep 16 '21 at 15:51

0 Answers0