0

I'd like to have some advice on the steps I have to follow to conduct a DR drill for below PostgreSQL setup.

Environment Setup

I have 2 PostgreSQL servers in my production environment which are clustered using pgpool. ( reference: https://www.pgpool.net/docs/42/en/html/example-cluster.html - I am using a scaled down setup with 2 nodes )

In my DR environment, I have a separate single PostgreSQL server which replicates data from node 1 in production environment. The WAL streaming is used to implement the replication from production's node 1 to DR.

postgresql version - 13 pgpool version - 4.2

enter image description here DR Drill Activity

Phase 1: In the failover phase of the DR drill, the production PostgreSQL servers will be stopped and traffic will be switched to DR environment.

Phase 2: In the fall back phase of the DR drill, the production PostgreSQL servers will be started back and traffic will be switched back to the production environment. Before switching back the traffic, the production PostgreSQL cluster should have updated data records.

Questions

Q1: During the failover phase of the DR drill, how can I enable DR's PostgreSQL server to accepts write requests? ( currently, the DR is only supporting read requests )

Q2: In the fallback phase, the production's PostgreSQL cluster should be restored with latest data from DR. How can I achieve this? I'm also doubtful on the high level steps that I should follow during the fallback phase.

Thanks in advance.

1 Answers1

1

To your first question: you promote the standby, either with pg_ctl promote, the pg_pronote database function or using a trigger file that you configured on the standby.

To your second question: you can try pg_rewind, and if that doesn't work, build a new standby from scratch using pg_basebackup.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thanks for sharing, I used 'pg_ctl promote' and afterwards my DR instance can accept write requests, in the failover stage. – Shan Siriwardana May 24 '22 at 10:51
  • After the fallback stage, I want the the VM-1 to function as the master back again and the VM-2 and VM-3 to act as a replicas. Considering this requirement, may I know if I can go ahead with pg_rewind? I'm also wondering if I should take any action in the fall back stage to disable/erase setting of the WAL streaming records [ VM1-VM2 and VM1-VM3 ] that used to exist before stopping 2 prod servers initially? Thank You – Shan Siriwardana May 24 '22 at 11:08
  • 1
    I can't give you a full tutorial on how to fail over PostgreSQL here. Yes, you can use `pg_rewind` (if you configured PostgreSQL correctly by activating `wal_log_hints`). I don't understand the second part of the question, but I guess that would be a different question anyway. – Laurenz Albe May 24 '22 at 12:06
  • Thanks a lot for your inputs, Let me try if I can work this out. – Shan Siriwardana May 24 '22 at 16:55