
I'm running PostgreSQL 9.6 (in Docker, using the postgres:9.6.13 image) and psycopg2 2.8.2.

My PostgreSQL server (local) hosts two databases. My goal is to create materialized views in one of the databases that use data from the other database using Postgres's foreign data wrappers. I do all this from a Python script that uses psycopg2.

This works well as long as creating the materialized view does not take too long (i.e. if the amount of data being imported isn't too large). However, if the process takes longer than roughly 250 seconds, psycopg2 throws the exception

psycopg2.OperationalError: server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

No error message (or any message concerning this whatsoever) can be found in Postgres's logs.

Materialized view creation completes successfully if I do it from an SQL client (Postico).

This code illustrates roughly what I'm doing in the Python script:

import psycopg2 as pg

db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port
)
with db.cursor() as c:
    c.execute("""
        CREATE EXTENSION IF NOT EXISTS postgres_fdw;
        CREATE SERVER fdw FOREIGN DATA WRAPPER postgres_fdw OPTIONS (...);
        CREATE USER MAPPING FOR CURRENT_USER SERVER fdw OPTIONS (...);
        CREATE SCHEMA "foreign";  -- quoted: FOREIGN is a reserved word
        IMPORT FOREIGN SCHEMA foreign_schema FROM SERVER fdw INTO "foreign";
    """)
    c.execute("""
        CREATE MATERIALIZED VIEW IF NOT EXISTS my_view AS (
            SELECT (...)
            FROM "foreign".foreign_table
        );
    """)
db.commit()
haroba

3 Answers


Adding the keepalive parameters to the psycopg2.connect call seems to have solved the problem:

self.db = pg.connect(
    dbname=config.db_name,
    user=config.db_user,
    password=config.db_password,
    host=config.db_host,
    port=config.db_port,
    keepalives=1,
    keepalives_idle=30,
    keepalives_interval=10,
    keepalives_count=5
)

I still don't know why this is necessary. I can't find anyone else who describes having to use the keepalive keyword parameters with Postgres in Docker just to run queries that take longer than 4-5 minutes, but maybe it's obvious enough that nobody has noted it?
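For reference, libpq also accepts these settings inside a connection string, so the same fix can be expressed as a DSN. A minimal sketch (the connection values and the helper name are placeholders, not from the original post); with the values above, a dead peer is detected after at most 30 + 10 × 5 = 80 seconds of silence:

```python
# Sketch: assemble a libpq-style DSN carrying the keepalive settings.
# All connection values here are placeholders.
def make_keepalive_dsn(dbname, user, host, port=5432):
    params = {
        "dbname": dbname,
        "user": user,
        "host": host,
        "port": port,
        "keepalives": 1,            # enable TCP keepalives on the socket
        "keepalives_idle": 30,      # seconds of idleness before the first probe
        "keepalives_interval": 10,  # seconds between unanswered probes
        "keepalives_count": 5,      # lost probes before declaring the peer dead
    }
    return " ".join(f"{k}={v}" for k, v in params.items())

dsn = make_keepalive_dsn("mydb", "me", "localhost")
# psycopg2.connect(dsn) accepts such a string directly.
```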

haroba
  • According to https://www.postgresql.org/docs/current/libpq-connect.html the default value of `keepalives` is already 1, so the connection should be kept alive. But my connection was still closed with an `OperationalError`. – CMCDragonkai Apr 18 '20 at 06:11
  • Also, the other parameters can be set to `0` to use the system default. However, I'm not sure where to find the default values. Maybe they are OS specific. – CMCDragonkai Apr 18 '20 at 06:12
  • I'm having this exact same issue with exactly the same setup, trying to create multiple materialized views based on an FDW table. After 5 minutes, the FDW connection gets closed. I tried the keepalives but they didn't seem to work for me. My Postgres logs show that psycopg2 is aborting the transaction because the connection to the fdw server was lost. – krchun May 21 '20 at 07:09
  • I'm baffled by this, and would like to know why this works too. I created a question for it if any of you know the answer: https://stackoverflow.com/q/71343175/604048 – André C. Andersen Mar 03 '22 at 20:09

We encountered the same issue, and resolved it by adding net.ipv4.tcp_keepalive_time=200 to our docker-compose.yml file:

services:
  myservice:
    image: myimage
    sysctls:
      - net.ipv4.tcp_keepalive_time=200

From what I understand, this makes the kernel send a keepalive probe after 200 seconds of idleness, which is less than the time it takes for the connection to be dropped (300 seconds?), thus preventing the drop.
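To see which values a container is actually running with, the kernel's TCP keepalive settings can be read from `/proc` (Linux only; these are the standard sysctl paths, the helper name is made up for illustration):

```python
# Read the kernel's current TCP keepalive settings (Linux only).
from pathlib import Path

def tcp_keepalive_settings():
    base = Path("/proc/sys/net/ipv4")
    names = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")
    # Each file holds a single integer (seconds, or a probe count).
    return {name: int((base / name).read_text()) for name in names}

print(tcp_keepalive_settings())
```

The stock default for `tcp_keepalive_time` is 7200 seconds, far longer than the roughly 300 seconds after which the connection here seems to be dropped, which is why lowering it to 200 helps.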

André C. Andersen

It might be that PostgreSQL 9.6 kills your connections after the timeout mentioned at https://stackoverflow.com/a/45627782/1587329. In that case, you could set `statement_timeout` in postgresql.conf, but that is not recommended.

It might work in Postico because the value has been set there.

For the error to show up in the logs, `log_min_error_statement` needs to be set to `ERROR` or lower.
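These timeouts can also be overridden per session from psycopg2 itself, via libpq's `options` parameter, instead of editing postgresql.conf. A hedged sketch (the connection values are placeholders; `-c name=value` is standard libpq syntax for per-session settings):

```python
# Sketch: disable the session timeouts via libpq's options parameter.
# Connection values are placeholders, not from the original post.
conn_kwargs = dict(
    dbname="mydb",
    user="me",
    host="localhost",
    options="-c statement_timeout=0 -c idle_in_transaction_session_timeout=0",
)
# pg.connect(**conn_kwargs) would open a session with both timeouts disabled;
# running SHOW statement_timeout; afterwards confirms the effective value.
print(conn_kwargs["options"])
```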

serv-inc
  • Hi, `idle_in_transaction_session_timeout` is already set to 0 (and so are all the other timeouts). `log_min_error_statement` is already set to `ERROR`. – haroba May 27 '19 at 08:40
  • just to make sure: `statement_timeout` is 0 as well? This seems to have been introduced in 9.6 – serv-inc May 27 '19 at 08:57
  • Yes, `statement_timeout` is zero. – haroba May 27 '19 at 09:18
  • Did you try to set a `connect_timeout` as mentioned in https://stackoverflow.com/questions/27641740/python-psycopg2-timeout? – serv-inc May 27 '19 at 09:27
  • Yes, but I have no idea why that would do anything to help. – haroba May 27 '19 at 09:38
  • It mentioned "up to 5 minutes" which approximately matches your reported 250 seconds. Both fail with an `OperationalError`. – serv-inc May 27 '19 at 10:06
  • For a client designed to have long-running connections, would it be appropriate to set all timeouts to 0? – CMCDragonkai Apr 18 '20 at 06:13