
I have an app built with FastAPI, using SQLAlchemy to handle all the DB-related work.

When the APIs are triggered from the frontend, I see that connections are opened and then remain in the `idle` state for a while. Is it possible to reduce this idle time via SQLAlchemy?

I do the following to connect to the Postgres DB:

import sqlalchemy as db

eng = db.create_engine(<SQLALCHEMY_DATABASE_URI>)
conn = eng.connect()
metadata = db.MetaData()
table = db.Table(
    <table_name>,
    metadata,
    autoload=True,
    autoload_with=eng)
user_id = 1
try:
    if ids_by_user is None:
        # most recent created_at for this user
        query = db.select([table.columns.created_at]).where(
                table.columns.user_id == user_id,
            ).order_by(
                table.columns.created_at.desc()
            )
        result = conn.execute(query).fetchmany(1)
        
        time = result[0][0]
        # all rows that share that timestamp
        time_filtering_query = db.select([table]).where(
            table.columns.created_at == time
        )
        time_result = conn.execute(time_filtering_query).fetchall()
        conn.close()
        return time_result
    else:
        output_by_id = []
        # fetch each requested id, restricted to this user
        for i in ids_by_user:
            query = db.select([table]).where(
                db.and_(
                    table.columns.id == i,
                    table.columns.user_id == user_id
                )
            )
            result = conn.execute(query).fetchall()
            output_by_id.append(result)

        output_by_id = [output_by_id[j][0]
                        for j in range(len(output_by_id))
                        if output_by_id[j]]
        conn.close()
        return output_by_id
finally:
    eng.dispose()

Even after logging out of the app, the connections stay open and `idle` for a while rather than closing immediately.

Edit 1

I tried using `NullPool` and the connections are still idle and in ROLLBACK, which is the same behaviour as when I didn't use `NullPool`.
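
For reference, a rough sketch of how `NullPool` is typically enabled (illustrative only, not my exact engine setup):

from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

# With NullPool, SQLAlchemy does no pooling at all: each connection is opened
# on demand and is really closed when conn.close() is called.
eng = create_engine(<SQLALCHEMY_DATABASE_URI>, poolclass=NullPool)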


  • Does this answer your question? [How to set connection timeout in SQLAlchemy](https://stackoverflow.com/questions/35640726/how-to-set-connection-timeout-in-sqlalchemy) – Mike Organek Jun 11 '22 at 17:00
  • @MikeOrganek If I am not wrong, `connect_timeout` means the `Maximum time to wait while connecting`, which is not similar to closing the connection when idle, right? Please feel free to correct me if I am wrong – some_programmer Jun 11 '22 at 17:08
  • You are right, and I am wrong. My apologies. – Mike Organek Jun 11 '22 at 17:18
  • If I understand you correctly, you may give `NullPool` a try. That idling behaviour may be a result of SQLAlchemy connection pooling. – bc291 Jun 11 '22 at 17:29
  • @bc291 I tried that too, but the connections are still idle – some_programmer Jun 11 '22 at 17:58
  • It looks weird that you have `idle` connections when using `NullPool`. Could you provide your SQLAlchemy engine setup? Are you sure that `connection.close()` is executed? Maybe a rollback causes a different part of the code to run and the connection is not closed. – jorzel Jun 11 '22 at 17:58
  • @jorzel I added the code to the question – some_programmer Jun 11 '22 at 19:01

1 Answer


You can reduce connection idle time by setting a maximum lifetime per connection with `pool_recycle`. Note that connections that are already checked out will not be terminated until they are no longer in use.

If you want both to reduce idle time and to keep the overall number of unused connections low, you can set a smaller `pool_size` and then set `max_overflow` to allow extra connections to be allocated when the application is under heavier load.

from sqlalchemy import create_engine
e = create_engine(
    <SQLALCHEMY_DATABASE_URI>,
    pool_recycle=3600,  # idle connections will be recycled after 1 hour
    pool_size=5,        # pool size under normal conditions
    max_overflow=5      # additional connections when the pool size is exceeded
)

Google Cloud has a helpful guide on optimizing Postgres connection pooling that you might find useful.

  • It is an interesting idea. However, I think that `pool_recycle` does not terminate a connection after X seconds, but only refreshes it. You would probably still have an `idle` connection, but you can be sure that it is not dead. – jorzel Jun 11 '22 at 18:40
  • @jorzel it appears that the implementation will terminate a stale connection upon checkout, so if the connection is too old it will be replaced at the next checkout, meaning a request will never use a stale connection. If you keep `pool_recycle` at a low value it will reduce the overall idle time of connections, as long as your app is getting regular requests, which it should be if it has a regular healthcheck every few minutes/seconds. It's not perfect, but it should still significantly cut overall idle time. – THX1138 Jun 11 '22 at 19:59
  • @jorzel If this app will be under development for a while or is not mission critical, you can also try the beta version of SQLAlchemy 2, which uses the new psycopg3 dialect. psycopg3 has a [complementary pool library](https://www.psycopg.org/psycopg3/docs/api/pool.html#module-psycopg_pool) that uses worker threads to maintain proper pool state and purge stale connections. Then you could let the engine [delegate to the pool to fetch connections](https://docs.sqlalchemy.org/en/20/core/engines.html#fully-replacing-the-dbapi-connect-function). – THX1138 Jun 11 '22 at 20:28
  • @THX1138 What do you mean when you mention the term 'checkout'? – some_programmer Jun 12 '22 at 11:22
  • @Junkrat when a new connection is retrieved from the pool. – THX1138 Jun 12 '22 at 15:26
  • @some_programmer "checkout" refers to the process of acquiring a connection from the pool for use by a client application. When a client requests a connection from the pool, the connection pool manager checks if there are available connections in the pool. If a connection is available, it is "checked out" to the client, meaning it is assigned to the client for its exclusive use during that session or transaction. – Anshul Tiwari Jul 05 '23 at 12:42
  • The `pool_recycle` option does not work for connections that are already checked out. To recycle those connections, you can use a custom function to check for invalid or stale connections. You can pass this function to the [pool_pre_ping](https://docs.sqlalchemy.org/en/20/core/pooling.html#dealing-with-disconnects) parameter when creating the engine. This will recycle any sessions created after the stale connection is detected. – Anshul Tiwari Jul 05 '23 at 12:50
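
Following up on the comments above, a quick sketch of what combining `pool_pre_ping` with `pool_recycle` could look like when creating the engine (parameter values are purely illustrative):

from sqlalchemy import create_engine

# pool_pre_ping issues a lightweight test query on checkout and transparently
# replaces a connection that has gone stale; pool_recycle caps how long a
# pooled connection may live before it is recycled.
e = create_engine(
    <SQLALCHEMY_DATABASE_URI>,
    pool_pre_ping=True,
    pool_recycle=1800  # illustrative value: recycle connections after 30 minutes
)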