I am trying to set up synchronous replication with Postgres 9.1, but I cannot get it to work. I was able to configure streaming replication, but not synchronous replication. I hope I have not missed anything obvious. I have carefully read many sections of chapters 14, 17, 18, 25, 26 and 29 of the admin guide.
I am running Ubuntu 12.04, and my master's postgresql.conf has these settings in addition to the standard ones:
listen_addresses = '*' # what IP address(es) to listen on;
wal_level = archive # minimal, archive, or hot_standby
archive_mode = on # allows archiving to be done
archive_command = 'test ! -f /data/pgWalArchive/%f && cp %p /data/pgWalArchive/%f'
wal_keep_segments = 100 # in logfile segments, 16MB each; 0 disables (what should this be?)
max_wal_senders = 3 # max number of walsender processes
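The excerpt above does not include the two parameters that, in 9.1, decide whether a commit waits for a standby: synchronous_standby_names and synchronous_commit. A minimal sketch for listing whatever the file actually sets for them follows; show_sync_settings is a hypothetical helper name, and the path is supplied by the caller:

```shell
# Print the active (uncommented) synchronous-replication settings
# found in a postgresql.conf file.
# show_sync_settings is a hypothetical helper name, not a PostgreSQL tool.
# $1 is the path to the configuration file to inspect.
show_sync_settings() {
    grep -E '^[[:space:]]*(synchronous_standby_names|synchronous_commit)[[:space:]]*=' "$1"
}
```

It could be called as show_sync_settings /etc/postgresql/9.1/main/postgresql.conf, assuming the usual Ubuntu package layout. No output means neither parameter is set explicitly in that file.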
My pg_hba.conf has these entries, in addition to the standard ones:
host all all XX.6.35.0/24 md5
host replication postgres XX.6.35.0/24 md5
My master database contains just one sequence, so it is tiny. I successfully took a base backup of the master and restored it on the standby:
sudo -u postgres pg_basebackup -D ~/backup -F tar -x -z -l ~/backup/base1 -v -h XX.6.35.51 -U postgres
I also copied the WAL archive files to the standby. My standby recovery.conf file has this:
restore_command = '/usr/lib/postgresql/9.1/bin/pg_standby /data/pgWalArchive %f %p %r'
archive_cleanup_command = '/usr/lib/postgresql/9.1/bin/pg_archivecleanup /data/pgWalArchive %r'
standby_mode = on
primary_conninfo = 'host=XX.6.35.51 port=5432' # e.g. 'host=masterIpAddressOrName port=5432'
Both servers start up without problems and the logs look fine. The standby log shows this:
2012-06-08 10:23:51 MDT LOG: shutting down
2012-06-08 10:23:51 MDT LOG: database system is shut down
2012-06-08 10:23:53 MDT LOG: database system was shut down in recovery at 2012-06-08 10:23:51 MDT
2012-06-08 10:23:53 MDT LOG: entering standby mode
2012-06-08 10:23:53 MDT LOG: consistent recovery state reached at 0/1D000078
2012-06-08 10:23:53 MDT LOG: record with zero length at 0/1D000078
2012-06-08 10:23:53 MDT LOG: streaming replication successfully connected to primary
2012-06-08 10:23:53 MDT LOG: incomplete startup packet
2012-06-08 10:23:54 MDT FATAL: the database system is starting up
2012-06-08 10:23:54 MDT FATAL: the database system is starting up
2012-06-08 10:23:55 MDT FATAL: the database system is starting up
2012-06-08 10:23:55 MDT FATAL: the database system is starting up
2012-06-08 10:23:56 MDT FATAL: the database system is starting up
2012-06-08 10:23:56 MDT FATAL: the database system is starting up
2012-06-08 10:23:57 MDT FATAL: the database system is starting up
2012-06-08 10:23:57 MDT FATAL: the database system is starting up
2012-06-08 10:23:58 MDT FATAL: the database system is starting up
2012-06-08 10:23:58 MDT FATAL: the database system is starting up
2012-06-08 10:23:59 MDT FATAL: the database system is starting up
2012-06-08 10:23:59 MDT LOG: incomplete startup packet
2012-06-08 10:24:40 MDT LOG: redo starts at 0/1D000078
The problem is that when I issue statements against the master, they hang forever. Am I missing something?
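For reference, in 9.1 the master lists its attached standbys in the pg_stat_replication view, which includes a sync_state column (async, potential or sync). A minimal sketch of a check that could be run while a statement is hanging follows; check_sync_state is a hypothetical helper name, and the postgres user and database are assumptions taken from the commands earlier in this post:

```shell
# Query the master for attached standbys and how it classifies each one.
# check_sync_state is a hypothetical helper name.
# $1 is the master's host name or IP address.
SYNC_STATE_QUERY="SELECT application_name, state, sync_priority, sync_state FROM pg_stat_replication;"

check_sync_state() {
    # -x prints one column per line, which is easier to read for a single row
    psql -h "$1" -U postgres -d postgres -x -c "$SYNC_STATE_QUERY"
}
```

If the standby appears with sync_state = async while synchronous_standby_names is non-empty on the master, the master has no standby it treats as synchronous, which would be consistent with commits waiting indefinitely.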