2

Below is a cut down version of the code I'm trying to run:

CREATE OR REPLACE VIEW Union_Test AS
(
    SELECT 
          C.Id
        , C.Code
        , C.Name
        , S.Id AS Status_Id 
    FROM 
          Client C 
          INNER JOIN 
          Status S ON S.Id = 1

    UNION ALL

    SELECT 
          CA.Id
        , CA.Code
        , CA.Name
        , S.Id AS Status_Id 
    FROM 
          Client_Archive CA 
          INNER JOIN 
          Status S ON S.Id = 2
);       

Rather than have soft deletes and 1 table (Client), when a client is deleted, that client is moved into the Client_Archive table. The above view is used to bring back the entire set of clients. I want to keep the tables relatively clean so add the Status_Id into the view rather than the tables.

The following code is then executed. This isn't quite what I do as I actually bring back the Page Number based on a Page Size variable but the below code is fine for this question.

SELECT 
      Id
    , ROW_NUMBER() OVER (ORDER BY Code) AS Rownum 
FROM 
      Union_Test

The Client (1,000,000 rows) and Client_Archive (1 row) tables have btree indexes on Id(PK), Code and Name The Status table has btree indexes on Id(PK) and Code. These were added as sanity checks as the Status table only has 5 rows.

The above query generates the following plan:

|QUERY PLAN                                                                                          |
|----------------------------------------------------------------------------------------------------|
|WindowAgg  (cost=165869.97..183369.97 rows=1000000 width=45) (actual time=8908.954..9490.758 rows=10|
|  ->  Sort  (cost=165869.97..168369.97 rows=1000000 width=37) (actual time=8908.933..9062.302 rows=1|
|        Sort Key: "*SELECT* 1".code                                                                 |
|        Sort Method: quicksort  Memory: 102702kB                                                    |
|        ->  Result  (cost=0.00..66212.13 rows=1000000 width=37) (actual time=0.027..604.587 rows=100|
|              ->  Append  (cost=0.00..56212.12 rows=1000000 width=37) (actual time=0.027..495.570 ro|
|                    ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..51210.03 rows=999999 width=37) (a|
|                          ->  Nested Loop  (cost=0.00..41210.04 rows=999999 width=159) (actual time=|
|                                ->  Seq Scan on status s  (cost=0.00..1.06 rows=1 width=0) (actual t|
|                                      Filter: (id = 1)                                              |
|                                      Rows Removed by Filter: 4                                     |
|                                ->  Seq Scan on client c  (cost=0.00..31208.99 rows=999999 width=37)|
|                    ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..2.09 rows=1 width=37) (actual tim|
|                          ->  Nested Loop  (cost=0.00..2.08 rows=1 width=159) (actual time=0.033..0.|
|                                ->  Seq Scan on client_archive ca  (cost=0.00..1.01 rows=1 width=37)|
|                                ->  Seq Scan on status s_1  (cost=0.00..1.06 rows=1 width=0) (actual|
|                                      Filter: (id = 2)                                              |
|                                      Rows Removed by Filter: 4                                     |
|Planning Time: 0.299 ms                                                                             |
|Execution Time: 9553.810 ms                                                                         |

There are no indexes used in the query and it takes nearly 10 secs to execute.

If I remove the UNION ALL and the Client_Archive portion of the query as below:

CREATE OR REPLACE VIEW Union_Test AS
(
    SELECT 
          C.Id
        , C.Code
        , C.Name
        , S.Id AS Status_Id 
    FROM 
          Client C 
          INNER JOIN 
          Status S ON S.Id = 1    
);       

I get the following plan when I run the Rownum query above:

|QUERY PLAN                                                                                          |
|----------------------------------------------------------------------------------------------------|
|WindowAgg  (cost=0.42..156177.45 rows=999999 width=45) (actual time=0.048..2059.087 rows=999999 loop|
|  ->  Nested Loop  (cost=0.42..141177.46 rows=999999 width=37) (actual time=0.038..1355.683 rows=999|
|        ->  Index Scan using client_code_idx on client c  (cost=0.42..128676.41 rows=999999 width=37|
|        ->  Materialize  (cost=0.00..1.07 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=999|
|              ->  Seq Scan on status s  (cost=0.00..1.06 rows=1 width=0) (actual time=0.012..0.013 r|
|                    Filter: (id = 1)                                                                |
|                    Rows Removed by Filter: 4                                                       |
|Planning Time: 0.213 ms                                                                             |
|Execution Time: 2125.573 ms                                                                         |

This is more the sort of plan I would expect given the indexes on the tables.

If I remove the joins to the Status table and reinstate the UNION ALL and Client_Archive portion of the query as below:

CREATE OR REPLACE VIEW Union_Test AS
(
    SELECT 
          C.Id
        , C.Code
        , C.Name
        , 1 AS Status_Id 
    FROM 
          Client C 

    UNION ALL

    SELECT 
          CA.Id
        , CA.Code
        , CA.Name
        , 2 AS Status_Id 
    FROM 
          Client_Archive CA 
);       

I get the following plan when I run the Rownum query above:

|QUERY PLAN                                                                                          |
|----------------------------------------------------------------------------------------------------|
|WindowAgg  (cost=0.56..163684.56 rows=1000000 width=45) (actual time=0.032..2579.727 rows=1000000 lo|
|  ->  Result  (cost=0.56..148684.56 rows=1000000 width=37) (actual time=0.023..1840.123 rows=1000000|
|        ->  Merge Append  (cost=0.56..138684.56 rows=1000000 width=37) (actual time=0.022..1694.317 |
|              Sort Key: c.code                                                                      |
|              ->  Index Scan using client_code_idx on client c  (cost=0.42..128676.41 rows=999999 wi|
|              ->  Index Scan using client_archive_code_idx on client_archive ca  (cost=0.13..8.14 ro|
|Planning Time: 0.176 ms                                                                             |
|Execution Time: 2647.764 ms                                                                         |

Again - pretty much as expected. It seems to be that the combination of UNION ALL and a JOIN in the view, and then ORDER BY in the query against the view, produces a sub-optimal plan. Or I'm missing something obvious. I'm relatively new to PostgreSQL so the latter is probably the case!

I can't add the Status in later once the data has been whittled down to the page I'm after as order by Status (obviously ordering is done by Code in the examples above) is one of the options offered to the end user.

I know I can just move everything into one Client table, or even order on a Status Id literal (1 or 2 in this case) without a join to the Status table - but this negates part of what I'm trying to do with the db design. I'm also not looking to create a materialized view with an index or create a Client_All table that holds the Client and Client_Archive data unless there is no other option.

Thanks, Rich.

Ron73
  • 21
  • 1
  • Is status a table? If so, what's the connection between client and status table? – zedfoxus Jun 11 '20 at 01:41
  • Yes - it's a table. The connection is added in the view via the Status Id. 1 is 'Active' and 2 is 'Archived' in this case. What I haven't shown is the Status Code and Status Description from the Status table. There will be a need to sort on Status Code in the future which is why there is a join to the Status table in the first place. – Ron73 Jun 11 '20 at 13:19

0 Answers0