High Level DR Drill Steps for a PostgreSql clustered setup

Question

I'd like to have some advice on the steps I have to follow to conduct a DR drill for below PostgreSQL setup.

Environment Setup

I have 2 PostgreSQL servers in my production environment which are clustered using pgpool. ( reference: https://www.pgpool.net/docs/42/en/html/example-cluster.html - I am using a scaled down setup with 2 nodes )

In my DR environment, I have a separate single PostgreSQL server which replicates data from node 1 in production environment. The WAL streaming is used to implement the replication from production's node 1 to DR.

postgresql version - 13 pgpool version - 4.2

DR Drill Activity

Phase 1: In the failover phase of the DR drill, the production PostgreSQL servers will be stopped and traffic will be switched to DR environment.

Phase 2: In the fall back phase of the DR drill, the production PostgreSQL servers will be started back and traffic will be switched back to the production environment. Before switching back the traffic, the production PostgreSQL cluster should have updated data records.

Questions

Q1: During the failover phase of the DR drill, how can I enable DR's PostgreSQL server to accepts write requests? ( currently, the DR is only supporting read requests )

Q2: In the fallback phase, the production's PostgreSQL cluster should be restored with latest data from DR. How can I achieve this? I'm also doubtful on the high level steps that I should follow during the fallback phase.

Thanks in advance.

score 1 · Answer 1 · answered May 23 '22 at 05:57

1

To your first question: you promote the standby, either with pg_ctl promote, the pg_pronote database function or using a trigger file that you configured on the standby.

To your second question: you can try pg_rewind, and if that doesn't work, build a new standby from scratch using pg_basebackup.

answered May 23 '22 at 05:57

Laurenz Albe

209,280
17
206
263

Thanks for sharing, I used 'pg_ctl promote' and afterwards my DR instance can accept write requests, in the failover stage. – Shan Siriwardana May 24 '22 at 10:51
After the fallback stage, I want the the VM-1 to function as the master back again and the VM-2 and VM-3 to act as a replicas. Considering this requirement, may I know if I can go ahead with pg_rewind? I'm also wondering if I should take any action in the fall back stage to disable/erase setting of the WAL streaming records [ VM1-VM2 and VM1-VM3 ] that used to exist before stopping 2 prod servers initially? Thank You – Shan Siriwardana May 24 '22 at 11:08
1

I can't give you a full tutorial on how to fail over PostgreSQL here. Yes, you can use `pg_rewind` (if you configured PostgreSQL correctly by activating `wal_log_hints`). I don't understand the second part of the question, but I guess that would be a different question anyway. – Laurenz Albe May 24 '22 at 12:06
Thanks a lot for your inputs, Let me try if I can work this out. – Shan Siriwardana May 24 '22 at 16:55

High Level DR Drill Steps for a PostgreSql clustered setup

Environment Setup

DR Drill Activity

Questions

1 Answers1