
I have workloads with heavy schema changes and other ETL operations that take locks.

Before doing schema changes on my primary table, I would like to first copy the existing contents of the primary table into a temporary table, then perform the schema change there, then sync all new changes, and once the time is right, do the cutover and have the temporary table become the primary table.

I know that I can use triggers in Postgres to sync data between two tables, and COPY to copy data from one table to another.
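For the sync part, I imagine a trigger along these lines. This is only a minimal sketch: it assumes both tables share an `id` primary key that is never updated plus a single `payload` column, and all names are illustrative.

```sql
-- Mirror writes on TableA into TableB.
-- Assumes: primary key "id" (never updated) and one data column "payload".
CREATE OR REPLACE FUNCTION sync_tablea_to_tableb() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO tableb (id, payload) VALUES (NEW.id, NEW.payload);
        RETURN NEW;
    ELSIF TG_OP = 'UPDATE' THEN
        UPDATE tableb SET payload = NEW.payload WHERE id = NEW.id;
        RETURN NEW;
    ELSE  -- TG_OP = 'DELETE'
        DELETE FROM tableb WHERE id = OLD.id;
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

-- EXECUTE FUNCTION needs PostgreSQL 11+; older versions use EXECUTE PROCEDURE.
CREATE TRIGGER tablea_sync
AFTER INSERT OR UPDATE OR DELETE ON tablea
FOR EACH ROW EXECUTE FUNCTION sync_tablea_to_tableb();
```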

But I am not sure how I can copy the existing data first, then set up the trigger so that no data is lost, and then do the cutover so the new table becomes the primary one.

What I am thinking is -

  • I issue a COPY from the primary table (TableA) into a temp table (TableB).
  • I then perform the schema change on TableB.
  • I then set up triggers from TableA to TableB for INSERT/UPDATE/DELETE.
  • ... Now I am not sure how to cut over so that TableB becomes TableA. Perhaps I can use `ALTER TABLE ... RENAME`? (See the sketch after this list.)
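For the swap itself, renames are transactional in Postgres, so I am thinking of something like the sketch below (names are from the list above; note that `ALTER TABLE ... RENAME` takes an `ACCESS EXCLUSIVE` lock, so writers block briefly during the swap):

```sql
BEGIN;
-- Block concurrent writes for the duration of the swap.
LOCK TABLE tablea IN ACCESS EXCLUSIVE MODE;
-- Stop mirroring before the rename.
DROP TRIGGER tablea_sync ON tablea;
-- Swap: keep the old table around until the cutover is verified.
ALTER TABLE tablea RENAME TO tablea_old;
ALTER TABLE tableb RENAME TO tablea;
COMMIT;
```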

It feels like I could lose changes between Step 1 and Step 3 (rows modified in TableA after the COPY has read them but before the trigger exists)?

Basically I am trying to ensure no data is lost across the three high-level operations (copy, sync, cut over). Is there a better way to do this?
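One ordering that seems to close that window, loosely inspired by how pg_repack uses triggers: install a change-logging trigger before the copy, replay the log into TableB after the copy finishes, repeat until the log drains, and only then do the rename swap. A rough sketch, where the log table and its columns are my own illustration and the same `id`/`payload` schema as above is assumed:

```sql
-- 1. Install the change log BEFORE copying, so no change falls in a gap.
CREATE TABLE tablea_log (
    seq     bigserial PRIMARY KEY,
    op      text NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
    id      int  NOT NULL,   -- TableA's primary key
    payload text             -- NEW.payload for inserts/updates
);

CREATE OR REPLACE FUNCTION log_tablea_changes() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO tablea_log (op, id) VALUES (TG_OP, OLD.id);
        RETURN OLD;
    END IF;
    INSERT INTO tablea_log (op, id, payload) VALUES (TG_OP, NEW.id, NEW.payload);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tablea_log_trg
AFTER INSERT OR UPDATE OR DELETE ON tablea
FOR EACH ROW EXECUTE FUNCTION log_tablea_changes();

-- 2. Copy the existing rows (INSERT ... SELECT as the in-database
--    equivalent of COPY); concurrent changes land in tablea_log.
INSERT INTO tableb (id, payload)
SELECT id, payload FROM tablea;

-- 3. Replay the log in order; run repeatedly until it drains, then do the
--    locked rename swap while applying the final batch. ON CONFLICT
--    assumes "id" is TableB's primary key.
DO $$
DECLARE r tablea_log%ROWTYPE;
BEGIN
    FOR r IN SELECT * FROM tablea_log ORDER BY seq LOOP
        IF r.op = 'DELETE' THEN
            DELETE FROM tableb WHERE id = r.id;
        ELSE
            INSERT INTO tableb (id, payload) VALUES (r.id, r.payload)
            ON CONFLICT (id) DO UPDATE SET payload = EXCLUDED.payload;
        END IF;
        DELETE FROM tablea_log WHERE seq = r.seq;
    END LOOP;
END;
$$;
```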

johncssjs
  • You will probably have to use "logical decoding" for that. See for example the implementation of `pg_squeeze`. – Laurenz Albe Nov 12 '21 at 05:35
  • Thank you! It offers good inspiration. I was looking at the design of pg_repack as well, which uses triggers. – johncssjs Nov 28 '21 at 14:38
  • Do you know if such a thing needs to be an extension, or can it live as a small library that instructs PG via SQL statements and polls where necessary? – johncssjs Nov 28 '21 at 15:26
  • How you package the code is your choice and independent of the approach. Extensions are easy to create and simple to use. – Laurenz Albe Nov 28 '21 at 20:37

0 Answers