0

I am currently using a micro services architecture and under high load (around 1000 requests per second), the services become drastically slow and most of the components that use database keep restarting due to the error

javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection

Hikari Logs

DEBUG HikariPool:411 - HikariPool-3 - Pool stats (total=20, active=20, idle=0, waiting=234)
DEBUG HikariPool:411 - HikariPool-3 - Pool stats (total=20, active=20, idle=0, waiting=240)

Symptoms

  • Database CPU maxes out
  • Slowness of services
  • Service downtime due to restarts
  • Number of waiting on hikari pool increases

Technologies

  • Kubernetes
  • Spring cloud
  • Hikari for spring boot jdbc pool
  • Hibernate
  • Postgres 13
  • Pgpool to manage postgres database connection pooling

System Configurations

  • Database server has 1 master and 2 read replicas (Database server is SSD and has 40Cores dedicated on each node). Server specification -> PowerEdge R740 with 192GB RAM Processor 2*(Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz Core Enabled:18 Thread Count: 36)
  • 3 pgpool instances: manages db connection for all instances of the micro services.
  • 7 microservices. Each can scale to a maximum of 10 pods.
  • 7 databases. Average size of 5gb. Each micro service has a separate database.

Async configurations for each micro service (manages threads spawning for spring-boot rest endpoints)

    @Override
    @Bean(name="taskExecutor")
    public AsyncTaskExecutor getAsyncExecutor() {
        ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
        threadPoolTaskExecutor.setThreadNamePrefix("Async-");
        threadPoolTaskExecutor.setCorePoolSize(3);
        threadPoolTaskExecutor.setMaxPoolSize(3);
        threadPoolTaskExecutor.setQueueCapacity(600);
        threadPoolTaskExecutor.afterPropertiesSet();
        return threadPoolTaskExecutor;
    }

Hikari Configurations for each micro service

spring.datasource.hikari.maxLifetime : 18000
spring.datasource.hikari.maxPoolSize : 20
spring.datasource.hikari.idleTimeut : 600
spring.datasource.hikari.minimumIdle : 20 
spring.datasource.hikari.connectionTimeout : 30000

Pg-pool configurations

PGPOOL_NUM_INIT_CHILDREN: 128
PGPOOL_MAX_POOL: 20
PGPOOL_CHILD_LIFE_TIME:300
PGPOOL_CLIENT_IDLE_LIMIT:600
PGPOOL_CONNECTION_LIFE_TIME:600
PGPOOL_CHILD_MAX_CONNECTIONS:2
PGPOOL_SR_CHECK_PERIOD: 21600

Query Plan

Query Plan

Planning Time: 0.223 ms
Execution Time: 13.816 ms

Attached is a screenshot of the new relic image and system architecture

New relic image

Architecture

What I have tried

  • Increased database CPU from 30 cores to 40 cores (however, no matter which value I increase it to, the CPU usage maxes out under high load)

  • I have tried to update hikari configurations to the default values, but still the issue persists.

spring.datasource.hikari.maxLifetime : 1800000
spring.datasource.hikari.maxPoolSize : 10
spring.datasource.hikari.idleTimeut : 60000
spring.datasource.hikari.minimumIdle : 10 
spring.datasource.hikari.connectionTimeout : 30000
  • I have optimized the queries to only fetch required fields

  • I have added indexes on the database tables to make queries faster

My Expectations

  • Best configurations for hikari and pgpool to eliminate the long time taken to get connection from DB and the "unable to get jdbc exception".
  • Suggestions on best springboot configurations in a distributed environment.
  • Improvement suggestions on scaling of spring boot microservices
Ifesinachi Bryan
  • 2,240
  • 1
  • 19
  • 20

0 Answers0