
I have a 3-node Postgres cluster managed by Patroni. Whenever the master node goes down and rejoins as a replica, the old master reports the error below:

2021-06-25T00:16:29.856133+00:00 host0 postgres_0[14131]: [7-2] #011DETAIL:  This server's history forked from timeline 1 at 0/5208600.
2021-06-25T00:16:29.862228+00:00 host0 postgres_0[112]: [4855-1] pid=112,session=60d4c1b1.70,line=4850,sqlstate=00000,user_app=,user=,db=,client=,txId=0 LOG:  new timeline 2 forked off current database system timeline 1 before current recovery point 0/60000A0
2021-06-25T00:16:34.857325+00:00 host0 postgres_0[14141]: [7-1] pid=14141,session=60d52062.373d,line=1,sqlstate=XX000,user_app=,user=,db=,client=,txId=0 FATAL:  could not start WAL streaming: ERROR:  requested starting point 0/6000000 on timeline 1 is not in this server's history

Here is the configuration I have used:

"hot_standby":                                "on",
"wal_log_hints":                              "on",
"restore_command":                            "cp /bp2/wal/psql/wal_archive/%f %p",
"archive_mode":                               "on",
"archive_command":                            "mkdir -p /bp2/wal/psql/wal_archive && test ! -f /bp2/wal/psql/wal_archive/%f && cp %p /bp2/wal/psql/wal_archive/%f",
"remove_data_directory_on_diverged_timelines":"true",
"remove_data_directory_on_rewind_failure":    "true",
"use_pg_rewind":                              "true",
"recovery_target_timeline":                   "latest"

I have already tried archiving the WAL files from all nodes to a shared directory and restoring from there, but the error is still the same.

Emon46
Dushyant Sapra

1 Answer


I think you need to run pg_rewind manually on the new replica. The old master was at a different WAL position than the new master when the timeline changed from 1 to 2, so its history has diverged from the new master's.

pg_rewind --source-server "user=<user> password=<user_password> host=<dns_or_ip_address> port=<server_port>" --target-pgdata <data-directory>
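To see the divergence in the question's logs, the two LSNs can be compared numerically. An LSN is printed as two hexadecimal numbers (high and low 32 bits) separated by a slash. The sketch below is only an illustration: the helper name `lsn_to_int` is made up for this example, and the two values are copied from the log lines in the question.

```shell
#!/bin/sh
# Hypothetical helper: convert an LSN such as 0/5208600 to a single integer.
# The part before the slash is the high 32 bits, the part after is the low 32 bits.
lsn_to_int() (
  hi="${1%%/*}"
  lo="${1##*/}"
  echo $(( (0x$hi << 32) + 0x$lo ))
)

# Values taken from the log lines in the question.
fork_point="0/5208600"       # where timeline 2 forked off timeline 1
recovery_point="0/60000A0"   # how far the old master had already written

if [ "$(lsn_to_int "$recovery_point")" -gt "$(lsn_to_int "$fork_point")" ]; then
  echo "old master is past the fork point: it holds WAL the new master never had"
else
  echo "old master is at or before the fork point"
fi
```

If the old master is past the fork point, it cannot simply stream from the new master, which is the situation pg_rewind is meant to repair.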


Emon46
  • pg_rewind needs the server to be stopped, and that is not what I want. – Dushyant Sapra Jul 06 '21 at 06:27
  • Why don't you use `pg_ctl -m fast -w stop`? It should work, because the Postgres server is not the main process under Patroni, so I don't think you will face any problems. – Emon46 Jul 06 '21 at 08:52
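The stop-then-rewind sequence discussed in these comments could look roughly like the sketch below. It is not a tested procedure for this cluster: the data directory and connection string are placeholders, the `run` wrapper and `rewind_old_primary` function are names invented for this example, and it assumes Patroni on this node has been kept from restarting Postgres while the rewind runs. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): replace with the real data directory and
# a connection string pointing at the new master.
PGDATA="${PGDATA:-/path/to/data-directory}"
SOURCE_CONNSTR="${SOURCE_CONNSTR:-host=new-master port=5432 user=postgres}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

rewind_old_primary() {
  # pg_rewind requires the target server to be cleanly shut down.
  run pg_ctl -D "$PGDATA" -m fast -w stop
  # First pass with pg_rewind's own --dry-run: reports what would change, modifies nothing.
  run pg_rewind --dry-run --progress \
      --source-server="$SOURCE_CONNSTR" --target-pgdata="$PGDATA"
  # Real rewind: resynchronises the old master's data directory with the new master.
  run pg_rewind --progress \
      --source-server="$SOURCE_CONNSTR" --target-pgdata="$PGDATA"
}

rewind_old_primary
```

After a successful rewind, the node still has to be started as a replica of the new master, which in this setup would normally be left to Patroni.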