I am doing a lot of ETL with Pandas and Postgres. I have a ton of idle connections, many marked with COMMIT
and ROLLBACK
, that I am not sure how to prevent from sitting as idle for long periods rather than closing. The main code I use to write to the database is using pandas to_sql
:
def write_data_frame(self, data_frame, table_name):
engine = create_engine(self.engine_string)
data_frame.to_sql(name=table_name, con=engine, if_exists='append', index=False)
I know this is definitely not best practice for PostgreSQL and I should be doing something like passing params to a Stored Procedure or Function or something, but this is how we are setup to get data_frames from non-Postgres databases / data sources and upload to Postgres.
My pgAdmin looks like this:
Can someone please point me in the right direction of how to avoid this many idle connections in the future? Some of our database connections are meant to be long-lived as they are continuous "batch" processes. But it seems like some one-off events are leaving connections open and idle.