0

I have two different environment Production (new) and pre-production (existing), We have given cluster ready with GP Installed on new prod environment.

I want to know what is the best way to move objects from Pre-Production Environment to Production Environment, I know:

  1. using gp_dump

  2. using pg_dump

  3. Manually dump each object (table ddl, functions ddl, view ddl, sequence ddl etc)

I want to know the best strategy and what are the pros and cons of each strategy, if only objects need to backup and restore from one environment to another.

Need your valuable input for the same.

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129

1 Answers1

0

The available strategies, ranged by priorities:

  1. Use gpcrondump and gpdbrestore. Will work only if the amount of segments in Pre-Production and Production are the same and the dbids are the same. The fastest way to transfer the whole database with schema as it would work as parallel dump and parallel restore. As it is the backup, it will lock pg_class for some short time, which might create some problems on Production system
  2. If the amount of objects to transfer is small, you can use gptransfer utility, see the user guide for reference. It provides you with an ability to transfer data directly between the segments of Pre-Production and Production. The requirement is that all the segment servers of the Pre-Production environment should be able to see all the segments from Production, which means the should be added to the same VLAN for data transfer.
  3. Write custom code and use writable external tables and readable external tables over the pipe object on the shared host. Also you would have to write some manual code to compare DDL. The benefit of this methout is that you can reuse the external tables to transfer the data between environments many times, and if DDL is not changed your transfer would be fast as the data won't be put to the disks. But all the data would be transferred through a single host, which might be a bottleneck (up to 10gbps transfer rate with dual 10GbE connections for the shared host). Another big advantage is that there would be no locks on the pg_class
  4. Run gpcrondump on the source system and restore the data serially on the target system. This is a way to go if you want to use backup-restore solution and your source and target systems have different amount of segments

In general, everything depends on what you want to achieve: move the objects a single time, move them once a month in a period of inactivity on the clusters, move all the objects weekly without stopping production, move the selected objects daily without stopping production, etc. The result would really depend on your needs

0x0FFF
  • 4,948
  • 3
  • 20
  • 26