Is there any way we could stop replication without logging into psql shell. Disk-full situation lead to some corruption in PG files and keep on restarting.
2023-02-06 08:17:54 UTC [1] LOG: starting PostgreSQL 13.7 (Ubuntu 13.7-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
2023-02-06 08:17:54 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2023-02-06 08:17:54 UTC [1] LOG: listening on IPv6 address "::", port 5432
2023-02-06 08:17:54 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2023-02-06 08:17:54 UTC [8] LOG: database system was shut down at 2023-02-06 08:17:45 UTC
2023-02-06 08:17:54 UTC [8] PANIC: could not open file "pg_replslot/slot_name/state": No such file or directory
2023-02-06 08:17:55 UTC [1] LOG: startup process (PID 8) was terminated by signal 6: Aborted
2023-02-06 08:17:55 UTC [1] LOG: aborting startup due to startup process failure
2023-02-06 08:17:55 UTC [1] LOG: database system is shut down
Tried removing pg_replslot/slot_name
which lead to "password auth failure" and After resetting DB password(via pg_hba.conf) DB is not showing up !
Is there any proper way to recover in this state? /pg/main files and pgdata directories seem to be available except this slot information.
Done below steps:
- I'm using PSQL docker container.
- disk used for PG got full. Cleaned up some log files and
docker system prune
was used to remove unused images which freed some space. But lead to this issue. - Multiple times, we have seen similar issue in Dev environments, Disk full leading to some corrupted files (unable to read/ No such file or directory) kind of errors.
- Tried removing
pg_replslot/slot_name
directory and it allowed me to start PSQL container.(previously is was keep on restarting container) - Reset password by using
trust
in auth column in pg_hbda.conf. - Now
\l
in psql shell showing only postgres DB and default DB's. Not showing our custom DB. - We have main DB in a separate tablespace and is not showing up in the list.
_ MOST importantly, Standby is also having SAME errors ! Probably someone messed it?