
I want to copy huge datasets between two Postgres servers. Specifically, I want to:

  1. copy some columns (not all) from two tables on the source DB server into a single table on the destination DB server.
  2. rename those columns in the destination table, since the destination column names are not the same as the origin column names (perhaps I should say: insert the values into some_column_name in the destination table).

I understand `postgres_fdw` cannot be used for this task, as it copies an entire table to a new one (it does not merge selected fields to form a new table), and the origin and destination tables must have a matching number of columns.

My source DB server is a remote machine I access over SSH, and the dataset to copy is about 5 GB.

Any suggestions on how to achieve this?

  • I would create a staging table and then use that for the FDW. That way you can select the data on the foreign server, write it into this intermediary staging table, then read from it on the other server. – J Spratt May 26 '19 at 16:09
  • `postgres_fdw` is just a 'client inside the server'; you create a local definition of your remote table and then you treat it just like a local table, but the wrapper handles all the mapping to a remote connection. You can trivially use it to meet your requirements, just create a FDW server, user mapping and schema (use `IMPORT FOREIGN SCHEMA`), then just run standard sql inserts; e.g. `INSERT INTO remote_table(col1, col2) SELECT col_a, col_b FROM local_table`. – Martijn Pieters Apr 26 '23 at 10:52
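The approach described in the second comment can be sketched as follows. All server, schema, table, and column names here are hypothetical placeholders; run this on the destination server:

-- Define a connection to the source server
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER src_server
  FOREIGN DATA WRAPPER postgres_fdw
  OPTIONS (host 'source.example.com', dbname 'sourcedb', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
  SERVER src_server
  OPTIONS (user 'src_user', password 'secret');

-- Mirror just the two source tables locally as foreign tables
CREATE SCHEMA src;
IMPORT FOREIGN SCHEMA public
  LIMIT TO (table_a, table_b)
  FROM SERVER src_server
  INTO src;

-- Pick the wanted columns, rename them via the target column list,
-- and merge both source tables into the single destination table
INSERT INTO destination_table (dest_col1, dest_col2, dest_col3)
SELECT a.col_x, a.col_y, b.col_z
FROM src.table_a a
JOIN src.table_b b ON b.a_id = a.id;

The renaming happens implicitly: the INSERT column list names the destination columns, while the SELECT list names the source columns, so they never need to match.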

1 Answer


Idea:

  • Have one server (the host) write into a staging table (example below).
  • When the host server writes to the staging table, it sets created_ts.
  • When the foreign server reads the data successfully, it sets xchange_ts.

Using a foreign data wrapper you may not need the xchange_err field, but the idea behind it is to record a failed import.


CREATE TABLE staging
( id          SERIAL PRIMARY KEY
, created_ts  TIMESTAMPTZ DEFAULT now() -- set when the host writes the row
, xchange_ts  TIMESTAMPTZ               -- set when the foreign server has read the row
, xchange_err TEXT DEFAULT NULL         -- populated if an import fails
, colA        type_datatype
, colB        type_datatype
);

I actually haven't used postgres_fdw myself, but I imagine the foreign table would be defined something like this:

CREATE FOREIGN TABLE staging
( id          INTEGER      -- PRIMARY KEY is not allowed on foreign tables
, created_ts  TIMESTAMPTZ
, xchange_ts  TIMESTAMPTZ
, xchange_err TEXT
, colA        type_datatype
, colB        type_datatype
)
SERVER <server_name>
OPTIONS (schema_name '<schema_name>', table_name 'staging');
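Assuming those definitions, the exchange itself could look something like this. The source table names, their join condition, and the destination column names are hypothetical:

-- On the host server: stage the selected columns from the two source tables
INSERT INTO staging (colA, colB)          -- created_ts filled by its DEFAULT
SELECT t1.source_col_a, t2.source_col_b
FROM table_one t1
JOIN table_two t2 ON t2.t1_id = t1.id;

-- On the foreign server, in one transaction: pull the unread rows
-- through the foreign table into the destination...
BEGIN;
INSERT INTO destination_table (dest_col_a, dest_col_b)
SELECT colA, colB
FROM staging
WHERE xchange_ts IS NULL;

-- ...and mark them as exchanged
UPDATE staging
SET xchange_ts = now()
WHERE xchange_ts IS NULL;
COMMIT;

Wrapping the read and the update in one transaction keeps a row inserted on the host between the two statements from being marked as exchanged without ever having been copied.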
  • Wouldn't it be better to run a query that writes the result set of selected columns from the source database/table to a file (say CSV), then copy the file to the destination DB and read it into the final "single" table? – arilwan May 26 '19 at 17:25
    You could do that. I went with this option because you set it up once and can reuse it for future imports. You could also set up triggers to run the imports and exports automatically. – J Spratt May 26 '19 at 17:59
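For reference, the CSV route from the comments can be sketched with psql's \copy meta-command (table and column names are placeholders):

-- In a psql session on the source server: export the selected columns
\copy (SELECT t1.col_x, t2.col_y FROM table_one t1 JOIN table_two t2 ON t2.t1_id = t1.id) TO 'export.csv' WITH (FORMAT csv)

-- Transfer export.csv to the destination machine (e.g. with scp),
-- then in a psql session on the destination server:
\copy destination_table (dest_col_x, dest_col_y) FROM 'export.csv' WITH (FORMAT csv)

Unlike server-side COPY, \copy reads and writes files on the client machine, which fits the "SSH into the remote, dump, transfer, load" workflow.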