1

I upgraded my postgresql from 9.5 to 9.6 in order to use the parallel execution to improve my performance. However, I didnt succeed to use it. In my main database almost all my selects look like : select * from foreign_table The foreign table is a foreign table that is located on an oracle database. Some of the tables are big 10G+ and 1,000,000+ records so parallel query should help me alot in this case of select.

The parameters that i configured :

min_parallel_relation_size = 200MB
max_parallel_workers_per_gather = 2 
max_worker_processes = 8 

When I try to use explain analyze select * from a big table that her size is 1.5G and has 5,000,000 records I see only foreign scan :

 Foreign Scan on customer_prod  (cost=10000.00..20000.00 rows=1000 
width=2438) (actual time=6.337..231408.085 rows=5770616 loops
=1)
  Oracle query: ......
 Planning time: 2.827 ms
 Execution time: 232198.137 ms

*I also tried select * from foreign_table where 1=1 but still same result.

On the other hand the next code worked :

   postgres=# CREATE TABLE people_mariel_test (id int PRIMARY KEY NOT NULL, age int NOT NULL);
   CREATE TABLE
   postgres=# INSERT INTO people_mariel_test  SELECT id, (random()*100)::integer AS age FROM generate_series(1,10000000) AS id;
   INSERT 0 10000000
   postgres=# explain analyze select * from people_mariel_test where age=6;
                                                            QUERY PLAN

   --------------------------------------------------------------------------------
   ------------------------------------------------
   ----------
   Gather  (cost=1000.00..123777.76 rows=50000 width=8) (actual        time=0.239..771.801 rows=99409 loops=1)
      Workers Planned: 1
      Workers Launched: 1
      ->  Parallel Seq Scan on people_mariel_test  (cost=0.00..117777.76 rows=29412 width=8) (actual time=0.045..748.213 rows=49704
    loops=2)
     Filter: (age = 6)
     Rows Removed by Filter: 4950296
    Planning time: 0.261 ms
    Execution time: 785.924 ms
   (8 rows)

Any idea how can I continue?

Vao Tsun
  • 47,234
  • 13
  • 100
  • 132
JeyJ
  • 3,582
  • 4
  • 35
  • 83
  • add `ora_fdw` tag and hope to get Laurenz attention. But I very much doubt that postgres parallel execution setting will not affect the extension – Vao Tsun Aug 02 '17 at 12:07

1 Answers1

2

From documentation:

A ForeignScan node can, optionally, support parallel execution. A parallel ForeignScan will be executed in multiple processes and should return each row only once across all cooperating processes. To do this, processes can coordinate through fixed size chunks of dynamic shared memory. This shared memory is not guaranteed to be mapped at the same address in every process, so pointers may not be used. The following callbacks are all optional in general, but required if parallel execution is to be supported.

I have searched source code for Oracle FDW by Laurenz Albe and it does not implement IsForeignScanParallelSafe and thus cannot use parallel execution.

Łukasz Kamiński
  • 5,630
  • 1
  • 19
  • 32
  • Thanks Lukasz. Shit, Is there a way I can improve performance if that is the case ? – JeyJ Aug 02 '17 at 12:52
  • Depends on what you are doing with that data. Can it be processed individually (in partitions)? Can you make actual partitions on Oracle server? You can try some [alternate parallel extensions](https://pgxn.org/dist/pmpp/doc/pmpp.html) or make your own. – Łukasz Kamiński Aug 02 '17 at 13:11
  • Hi, I cant make partitions of those tables in oracle server. When you wrote "make your own" you meant to write an extension ? In general, I`m only selecting * from 70 tables in oracle server into local postgresql tables. – JeyJ Aug 02 '17 at 13:14
  • Not necessarily extension, more like a mechanism/system for parallel execution, like a thread manager of sorts. We have that kind of thing where I work but it is too complicated for StackOverflow answer and I wouldn't be able to share it anyway. If you can't make partitions on Oracle server then you will also have to figure out how to efficiently split that data - that alone would be lengthy read as there are multitude of approaches to this issue and all of them have their cons and pros (sadly I can't find that article now). – Łukasz Kamiński Aug 02 '17 at 13:22