3

I was trying to upgrade the Bitnami PostgreSQL image from version 11 to version 14. When trying to do so, I was prompted with the following error:

The data directory was initialized by PostgreSQL version 11, which is not compatible with this version 14.0

In order to get around this, I created a new Postgres deployment with a new PVC, used pg_dump to take a backup of the data, and imported it into the new Postgres deployment, which is running version 14.
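Roughly, the process looked something like this (the pod names, database, and credentials are just placeholders):

# Dump one database from the old v11 pod (-C includes the CREATE DATABASE statement)
kubectl exec old-postgres-0 -- bash -c 'PGPASSWORD=$POSTGRES_PASSWORD pg_dump -C -U postgres mydb' > dump.sql

# Replay the dump against the new v14 pod
kubectl exec -i new-postgres-0 -- bash -c 'PGPASSWORD=$POSTGRES_PASSWORD psql -U postgres postgres' < dump.sql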

However, I'm going to need to repeat this process on larger databases with terabytes of data, and I don't think pg_dump is going to be sufficient.

With the Bitnami image, is it possible to use the likes of pg_upgrade?

krobbo

1 Answer

2

Yeah, backing up and restoring the way you did, with a new PVC and PV, is always a good option.

pg_dump and pg_restore are a robust native option; I think you can use the -j flag to run multiple parallel jobs to migrate the data.
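For example, a parallel dump and restore could look roughly like this (the host names, the database mydb, and the job count are placeholders); note that pg_dump only accepts -j with the directory output format (-Fd):

# Parallel dump with 4 jobs; -j requires the directory format (-Fd)
pg_dump -h old-postgresql -U postgres -Fd -j 4 -f /tmp/mydb_dump mydb

# Parallel restore with 4 jobs into the new cluster (the target database must already exist)
pg_restore -h new-postgresql -U postgres -d mydb -j 4 /tmp/mydb_dump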

Migrating terabytes of data will also require good network bandwidth and a scalable approach.

I'm not sure how you are running these instances or replicas.

You can do something like this:

Create a new Helm release of Postgres while the old one is still running.
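A rough sketch of this step (the release name, image tag, and value names are assumptions and depend on the chart version):

# Add the Bitnami chart repo if it is not configured already
helm repo add bitnami https://charts.bitnami.com/bitnami

# Install a second release alongside the old one, pinned to a PostgreSQL 14 image
helm install new-helm-db bitnami/postgresql --set image.tag=14.1.0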

Migrate the data:

kubectl exec -it new-helm-db-postgresql-0 -- bash -c 'export PGPASSWORD=${POSTGRES_PASSWORD}; time pg_dump -h old-postgresql -U postgres | psql -U postgres'

As suggested above, you can also add -j to run multiple jobs (which requires the directory dump format, so it can't simply be piped as above), but it will increase the pod's resource and disk usage when migrating terabytes of data.
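A hypothetical in-cluster variant of the command above, staging a directory-format dump on the new pod's volume so -j can be used for both the dump and the restore (the pod/service names, the /bitnami/postgresql mount path, the database mydb, and the job count are all assumptions):

kubectl exec -it new-helm-db-postgresql-0 -- bash -c '
  export PGPASSWORD=${POSTGRES_PASSWORD}
  # parallel dump from the old service into a directory on the new pod volume
  pg_dump -h old-postgresql -U postgres -Fd -j 4 -f /bitnami/postgresql/migration mydb
  # parallel restore into the local version 14 server (mydb must already exist)
  pg_restore -U postgres -d mydb -j 4 /bitnami/postgresql/migration
'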

You can also refer to: https://www.citusdata.com/blog/2021/02/20/faster-data-migrations-in-postgres/

I would also suggest using AWS DMS if you are on a managed cloud.

Otherwise, set up a VM with a migration tool and migrate the data from the old Postgres cluster to the new one.

Harsh Manvar
  • Thanks very much. Unfortunately we are not using a Helm release and are just deploying natively through kubectl. Using multiple jobs with pg_dump sounds very promising. For https://github.com/urbica/pg-migrate, would we spin that up on an EC2 instance, mount both old/new volumes, and copy the data? – krobbo Nov 09 '21 at 16:31
  • Okay, by EC2 I mean using it as middleware instead of using DMS. The EC2 instance will contain the tool, which will connect to the source and destination and migrate the data. So you can keep a big instance with a good amount of resources, and on that instance you run the migration tool or a multi-threaded dump. – Harsh Manvar Nov 09 '21 at 16:42
  • Thanks for sharing the answer! It worked in my scenario, where I dumped all the data from an old PostgreSQL version Pod to a new version Pod. – Carl Tsai Jan 06 '22 at 02:39