0

My application is throwing "Connection pool getting exhausted error" after running for couple of hours. I'm using psycopg2 multithreaded connection pool to connect my Dash application to PostgresSQL database. My dash application refreshes itself at 30 seconds interval

Dash app setup:

import dash
from dash.dependencies import Input, Output, State
from dash_core_components.Loading import Loading
import dash_html_components as html
import dash_core_components as dcc
import dash_bootstrap_components as dbc
import get_data  # File which retrieves the data from database

application = Flask(__name__)
app = dash.Dash(
    __name__,
    server=application,
    suppress_callback_exceptions=True,
    external_stylesheets=[dbc.themes.DARKLY],
)

max_date, min_date = get_data.get_max_min_date()
max_datetime = 0

alerttoast = dbc.Toast(
    "New Data Incoming!!",
    id="alert-toast",
    header="New Data",
    icon="info",
    dismissable=True,
    duration=5000,
    style={"position": "fixed", "top": 66, "right": 10, "width": 350},
)

sorted_region_list = [
    "Western Region",
    "Mid-Atlantic Region",
    "Southern Region"
]

app.layout = html.Div(
    [
        html.Div(
            [
                dcc.Dropdown(
                    id="region",
                    options=[{"label": i, "value": i} for i in sorted_region_list],
                    value=["Western Region",],
                    optionHeight=25,
                    multi=True,
                    searchable=True,
                    clearable=True,
                    persistence_type="session",
                    placeholder="Please select only one Region",
                ),
                html.Hr(),
                dcc.DatePickerRange(
                    id="my-date-picker-range",
                    min_date_allowed=min_date,
                    max_date_allowed=max_date,
                    initial_visible_month=max_date,
                    month_format="YYYY-MM-DD",
                    start_date=max_date,
                    end_date=max_date,
                    end_date_placeholder_text="YYYY-MM-DD",
                    clearable=True,
                ),
                html.Hr(),
                html.Button(
                    id="submit-button",
                    n_clicks=0,
                    children="Submit",
                ),
            ],
            style={
                "width": "40%",
                "display": "inline-block",
                "margin-left": "30px",
                "color": "black",
            },
        ),
        dcc.Loading(
            children=[
                html.Div(
                    [
                        dcc.Graph(id="load"),
                    ],
                    style={"padding": "1rem 0rem 1rem 0rem", "margin-top": "50px"},
                )
            ]
        ),
        html.Div(id="the-toast", children=[]),
        dcc.Interval(
            id="interval-component",
            interval=30 * 1000,
            n_intervals=0,  # in milliseconds
        ),
    ],
    style={
        "padding": "4rem 1rem 2rem 1rem",
    },
)

@app.callback(
    [Output("load", "figure"), Output("the-toast", "children")],
    [Input("submit-button", "n_clicks"), Input("interval-component", "n_intervals")],
    [
        State("region", "value"),
        State("my-date-picker-range", "start_date"),
        State("my-date-picker-range", "end_date"),
    ],
)
def update_graph(
    n_clicks,
    n_intervals,
    region_name,
    start_date,
    end_date,
):

    df = get_data.get_data(start_date, end_date, region_name)
    fig = go.Figure()
    fig.add_trace(go.Scatter(
        x = df['x_axis_column']
        y = df['y_axis_column'])
    if max_datetime == 0:
        max_datetime = max(new_data)
        return fig, alerttoast
    if max_datetime == max(new_data):
        return fig, dash.no_update
    else:
        max_datetime = max(new_data)
        return fig, alerttoast

get_data.py file

Making a Connection

import config
import psycopg2
from psycopg2 import pool
from psycopg2 import sql
import pandas as pd

try:

    connection_pool = psycopg2.pool.SimpleConnectionPool(
        40,
        100,
        host=os.host,
        port=os.port,
        database=os.database,
        user=os.user,
        password=os.dbpassword,
    )
    # cur = conn.cursor()
    print("Connected to the Database!!")
except Exception as e:
    print("Database connection failed due to {}".format(e))


def get_data(start_date, end_date, region_name):
    conn = connection_pool.getconn()
    with conn.cursor() as cur:
         query = """query"""
     
         cur.execute(
                query,
                (
                    input1,
                    input2,
                    input2,
                ),
            )
         df= pd.DataFrame.from_records(
                cur.fetchall(), columns=[desc[0] for desc in 
    cur.description]
            )
     connection_pool.putconn(conn)

# For loop queries run only if user selects a date range
    if start_date == end_date:
       conn = connection_pool.getconn()
       with conn.cursor() as cur:
            cur.execute(
                    """query""",
                    (
                        input1,
                        input2,
                        input3,
                        input4,
                    ),
                )
            df= pd.DataFrame.from_records(
                    cur.fetchall(), columns=[desc[0] for desc in cur.description]
                )
           
        connection_pool.putconn(conn)

    else:
        dates = pd.date_range(start_date, end_date, freq="d").date
        mydf = []
        conn = connection_pool.getconn()
        for date in dates:
            date = str(date)
            with conn.cursor() as cur:
                query = """query"""
                    cur.execute(
                        query,
                        (
                            input1,
                            input2,
                            input3,
                            input4,
                        ),
                    )
                df= pd.DataFrame.from_records(
                        cur.fetchall(), columns=[desc[0] for desc in cur.description]
                    mydf.append(df)
            connection_pool.putconn(conn)
            
            df= pd.concat(mydf, axis=0, ignore_index=False)

Error Message from the logs:

web: Error from get data block connection pool exhausted

Error while implementing the logging connections for simple connection pool

In get_data.py file i used the logic explained here link How do I use Psycopg2's LoggingConnection?

import config
import psycopg2
import logging
from psycopg2 import pool
from psycopg2 import sql
from psycopg2.extras import LoggingConnection

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
try:

    connection_pool = psycopg2.pool.SimpleConnectionPool(
        50,
        100,
        # dsn=DSN
        host=config.host,
        port=config.port,
        database=config.database,
        user=config.user,
        password=config.dbpassword,
        connection_factory=LoggingConnection,
    )
    connection_pool.initialize(logger)
    print("Connected to the Database!!")
except Exception as e:
    print("Database connection failed due to {}".format(e))
I got error: Database connection failed due to 'SimpleConnectionPool' object has no attribute 'initialize'

I have checked the code I'm putting all the connections back to the pool. I have checked other posts as well, but couldn't find a solution. I don't know what I'm doing wrong here, any help would be appreciated!

  • Try from [Logging](https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT) enabling `log_connections`, `log_disconnections` and then tailing the log to see what is actually happening. – Adrian Klaver Jun 12 '21 at 14:39
  • You don't show us any use of multithreading, nor any use of plotly-dash (that I can detect, maybe I just don't know it when I see it). Is the fragment of code you show sufficient to reproduce the problem? – jjanes Jun 12 '21 at 16:02
  • @jjanes ,Sorry for that, I have edited the code, for testing purpose i've reverted to SimpleConnectionPool instead of ThreadedConnectionPool. Please have a look at the same. – Yadvendar Singh Jun 13 '21 at 19:33
  • @AdrianKlaver I was also trying to log the connections. But I was getting the error mentioned at the end of the updated question. I was following this link [link](https://stackoverflow.com/questions/28950212/how-do-i-use-psycopg2s-loggingconnection). But i was getting the error "Database connection failed due to 'SimpleConnectionPool' object has no attribute 'initialize'". Please have a look at the same, maybe I'm just too naiive for this stuff. Thank you in anticipation. – Yadvendar Singh Jun 13 '21 at 19:37
  • I was talking about looking at the Postgresql log directly via `tail` or a text editor. Per docs here [Pool](https://www.psycopg.org/docs/pool.html) the error is correct in that `pool.*Pool` has no `initialize` attribute. I'm guessing you are never actually making an initial connection. For instance I not sure how `os.host, os.port`, etc resolve to anything? – Adrian Klaver Jun 13 '21 at 20:33
  • @AdrianKlaver Thanks Adrian, I increased the number of connections and tested it for two days, The app is working fine. The initial connections were being made because the charts were loading correctly and the data was being pulled from the database. That means that os.host, os.port , etc were being used to make the connection. The `initialize` error was coming because there is no such attribute available for the `SimplePoolConnection` in psycopg2 documentation [link](https://www.psycopg.org/docs/pool.html) – Yadvendar Singh Jun 16 '21 at 14:53

0 Answers0