I tried to use pglogical to replicate data between 2 PG servers and found huge disk occupation during the process. Here is the details
Env:
Source PG:
- server name: pg-publisher.
- IP address: 10.0.0.1
- Whole disk space: 3.5T
- Size of PG data folder: 2.2T
- Size of biggest table:
- table name ->
my_schema.logs
. - pg_table_size('logs') = 1.4T
- table name ->
- PG is running, new data keeps coming in and inserting into the tables including
logs
table.
Destination PG:
- server name: pg-subscriber
- IP address: 10.0.0.2
- Whole disk space: 3T
How did I setup the replication:
On pg-publisher
Create node:
SELECT pglogical.create_node( node_name := 'publisher', dsn := 'host=10.0.0.1 port=5432 dbname=mydb user=pguser password=pgpassword' );
Create replication_set
SELECT pglogical.create_replication_set('my_replication_set'); SELECT pglogical.replication_set_add_all_tables('my_replication_set', ARRAY['my_schema']);
On pg-subscriber
Create node
SELECT pglogical.create_node( node_name := 'subscriber', dsn := 'host=10.0.0.2 port=5432 dbname=mydb user=pguser password=pgpassword' );
create subscription
SELECT pglogical.create_subscription( subscription_name := 'pg_subscriber', provider_dsn := 'host=10.0.0.1 port=5432 dbname=mydb user=pguser password=pgpassword', replication_sets := ARRAY['my_replication_set'], synchronize_structure := false, synchronize_data := true
); ```
Verification
After above action. I checked the status on both servers:
- on pg-subscriber. The pglogical replication is in
initializing
state (frompglogical.show_subscription_status()
) - on pg-publisher. The replication slot is created and does not active. That's expected because there are lots of data need to be initialized. It will not be active until it's reach to
replicating
state
My confusion
All the actions seems work as expected. But after 4 days, I got alert and found the available disk space on pg-publisher
decreased a lot (decrease over 20%).
Then here is what I found:
pglogical
replication is still ininitializing
state onpg-subscriber
. On pg-publisher, I checked thepg_stat_activity
on it and foundpglogical
event is usingCOPY
command to copy the biggest tablemy_schema.logs
.Folder size of PG data on
pg-publisher
increased from 2.2T to 2.7T
I know the publisher server will retain the WAL files required for the replication before it's reach to replicating
state, that will occupy the disk space. But I checked the pg_wal
folder size on pg-publisher
and found it's only less than 20GB. But the base
folder increased to 2.7T. May I know why does it happen. Does COPY
command occupy the disk space(I don't think so because I found it will COPY to stdout) or is there another thing will cause it?