2

My PostgreSQL Version 13. and below is parallel related parameters.

SELECT name, setting FROM pg_Settings WHERE name LIKE '%parallel%'

name                            |setting|
--------------------------------+-------+
enable_parallel_append          |on     |
enable_parallel_hash            |on     |
force_parallel_mode             |off    |
max_parallel_maintenance_workers|4      |
max_parallel_workers            |96     |
max_parallel_workers_per_gather |2      |
min_parallel_index_scan_size    |64     |
min_parallel_table_scan_size    |1024   |
parallel_leader_participation   |on     |
parallel_setup_cost             |1000   |
parallel_tuple_cost             |0.1    |

When I Run bleow Query, It works good. (only 3 secs)

EXPLAIN (analyze) 
SELECT   t1_code
        ,COUNT(1)  AS cnt
FROM     t1 a
WHERE    1=1
GROUP BY t1_code
      
      
    Finalize GroupAggregate  (cost=620185.13..620185.64 rows=2 width=12) (actual time=2953.797..3186.877 rows=2 loops=1)
      Group Key: t1_code
      ->  Gather Merge  (cost=620185.13..620185.60 rows=4 width=12) (actual time=2953.763..3186.835 rows=6 loops=1)
            Workers Planned: 2
            Workers Launched: 2
            ->  Sort  (cost=619185.11..619185.11 rows=2 width=12) (actual time=2926.805..2926.808 rows=2 loops=3)
                  Sort Key: t1_code
                  Sort Method: quicksort  Memory: 25kB
                  Worker 0:  Sort Method: quicksort  Memory: 25kB
                  Worker 1:  Sort Method: quicksort  Memory: 25kB
                  ->  Partial HashAggregate  (cost=619185.08..619185.10 rows=2 width=12) (actual time=2926.763..2926.768 rows=2 loops=3)
                        Group Key: t1_code
                        Batches: 1  Memory Usage: 24kB
                        Worker 0:  Batches: 1  Memory Usage: 24kB
                        Worker 1:  Batches: 1  Memory Usage: 24kB
                        ->  Parallel Seq Scan on t1 a  (cost=0.00..551015.72 rows=13633872 width=4) (actual time=0.017..1412.845 rows=10907098 loops=3)
    Planning Time: 1.295 ms
    JIT:
      Functions: 21
      Options: Inlining true, Optimization true, Expressions true, Deforming true
      Timing: Generation 2.595 ms, Inlining 156.371 ms, Optimization 112.165 ms, Emission 63.886 ms, Total 335.017 ms
    Execution Time: 3243.358 ms 

But, Without "EXPLAIN Analyze" That Query not use parallel process, when i see the pg_stat_activity. Only 1 process works. so Elapsed time is double. (6 secs)

T1 Table size is 3GB.

Thany for your help.

<< another test >> below is result of verbose, buffers option. In the same way before, without analyze, below query use single process. and orce_parallel_mode to ON does not effect.

EXPLAIN(ANALYZE, VERBOSE, BUFFERS)
SELECT COUNT(*) 
FROM (
     SELECT 
           ID
      FROM T2
     WHERE CODE1 <> '003'
       AND CODE2     <> 'Y'
       AND CODE3      <> 'Y'
     GROUP BY ID  
     ) t1 ;
      
Aggregate  (cost=204350.48..204350.49 rows=1 width=8) (actual time=2229.919..2229.997 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=216 read=140248 dirtied=10
  I/O Timings: read=1404.532
  ->  Finalize HashAggregate  (cost=202326.98..203226.31 rows=89933 width=14) (actual time=2128.682..2199.811 rows=605244 loops=1)
        Output: T2.ID
        Group Key: T2.ID
        Batches: 1  Memory Usage: 53265kB
        Buffers: shared hit=216 read=140248 dirtied=10
        I/O Timings: read=1404.532
        ->  Gather  (cost=182991.39..201877.32 rows=179866 width=14) (actual time=1632.564..1817.564 rows=1019855 loops=1)
              Output: T2.ID
              Workers Planned: 2
              Workers Launched: 2
              Buffers: shared hit=216 read=140248 dirtied=10
              I/O Timings: read=1404.532
              ->  Partial HashAggregate  (cost=181991.39..182890.72 rows=89933 width=14) (actual time=1592.762..1643.902 rows=339952 loops=3)
                    Output: T2.ID
                    Group Key: T2.ID
                    Batches: 1  Memory Usage: 32785kB
                    Buffers: shared hit=216 read=140248 dirtied=10
                    I/O Timings: read=1404.532
                    Worker 0:  actual time=1572.928..1624.075 rows=327133 loops=1
                      Batches: 1  Memory Usage: 28689kB
                      JIT:
                        Functions: 8
                        Options: Inlining false, Optimization false, Expressions true, Deforming true
                        Timing: Generation 1.203 ms, Inlining 0.000 ms, Optimization 0.683 ms, Emission 9.159 ms, Total 11.046 ms
                      Buffers: shared hit=72 read=43679 dirtied=2
                      I/O Timings: read=470.405
                    Worker 1:  actual time=1573.005..1619.235 rows=330930 loops=1
                      Batches: 1  Memory Usage: 28689kB
                      JIT:
                        Functions: 8
                        Options: Inlining false, Optimization false, Expressions true, Deforming true
                        Timing: Generation 1.207 ms, Inlining 0.000 ms, Optimization 0.673 ms, Emission 9.169 ms, Total 11.049 ms
                      Buffers: shared hit=63 read=44135 dirtied=6
                      I/O Timings: read=460.591
                    ->  Parallel Seq Scan on T2  (cost=0.00..176869.37 rows=2048806 width=14) (actual time=10.934..1166.528 rows=1638627 loops=3)
                          Filter: (((T2.CODE1)::text <> '003'::text) AND ((T2.CODE2)::text <> 'Y'::text) AND ((T2.CODE3)::text <> 'Y'::text))
                          Rows Removed by Filter: 24943
                          Buffers: shared hit=216 read=140248 dirtied=10
                          I/O Timings: read=1404.532
                          Worker 0:  actual time=10.083..1162.319 rows=1533436 loops=1
                            Buffers: shared hit=72 read=43679 dirtied=2
                            I/O Timings: read=470.405
                          Worker 1:  actual time=10.083..1161.430 rows=1561181 loops=1
                            Buffers: shared hit=63 read=44135 dirtied=6
                            I/O Timings: read=460.591
Planning:
  Buffers: shared hit=70
Planning Time: 0.253 ms
JIT:
  Functions: 31
  Options: Inlining false, Optimization false, Expressions true, Deforming true
  Timing: Generation 4.451 ms, Inlining 0.000 ms, Optimization 2.182 ms, Emission 29.515 ms, Total 36.148 ms
Execution Time: 2234.037 ms
seankim
  • 83
  • 6
  • Could you use EXPLAIN(ANALYZE, VERBOSE, BUFFERS) to get some insights about IO and memory usage? And what happens when you set force_parallel_mode to ON ? – Frank Heikens Jun 10 '22 at 07:46
  • "max_parallel_workers |96" That seems a bit extreme. What is max_worker_processes? – jjanes Jun 10 '22 at 17:21
  • How many times did you run it each way, and was it 100% reliable on taking 2-3s vs 6s? Polling pg_stat_activity doesn't seem like the most reliable way to determine parallel status. Can you set up auto_explain, so you get a more direct report? – jjanes Jun 10 '22 at 17:25

2 Answers2

0

I solved the problem. This issue was a DBeaver issue. Using psql, parallel processing is performed, and DBeaver is treated as a single process. In DBeaver, parallel processing is performed by CTRL+ALT+Shift+A. However, when Ctrl+Enter is performed, it is treated as a Single process.

seankim
  • 83
  • 6
0

When you run explain analyze you are actually executing the query as if it were typed into psql. Additionally there are no results back to the user the results are piped into /dev/null. When you execute the query in DBeaver it is using the pgjdbc JDBC driver to execute the query and is likely setting the fetch size to a non-zero value which turns off parallel query. See https://www.postgresql.org/docs/current/when-can-parallel-query-be-used.html for details.

Dave Cramer
  • 76
  • 1
  • 4