
I am building an app using Spring Boot/Hibernate with Postgres as the database. I am on Spring Boot 2.0, so HikariCP is the default connection pool provider.

Currently, I am load-testing the application against a REST endpoint that does an 'update if exists, insert if new' on an entity in the database. It's a fairly small entity with a BIGSERIAL primary key and no constraints on any other field.
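
For context, here is a minimal sketch of what such an entity and its 'update if exists, insert if new' path might look like, assuming Spring Data JPA (the `Item` entity, its fields, and the repository method are hypothetical, not my actual code):

```java
// Hypothetical sketch of the entity and the upsert path, assuming Spring Data JPA.
// Names (Item, ItemRepository, findByName, ...) are illustrative only.
import java.time.Instant;
import java.util.Optional;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Entity
class Item {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY) // intended to map to the BIGSERIAL primary key
    Long id;
    String name;
    String value;
    Instant updatedAt;
}

interface ItemRepository extends JpaRepository<Item, Long> {
    Optional<Item> findByName(String name); // derived query on a non-key field
}

@Service
class ItemService {
    private final ItemRepository repository;

    ItemService(ItemRepository repository) {
        this.repository = repository;
    }

    @Transactional // one transaction per upsert request
    public Item upsert(String name, String value) {
        Item item = repository.findByName(name).orElseGet(Item::new); // insert if new
        item.name = name;                                             // update if it already exists
        item.value = value;
        item.updatedAt = Instant.now();
        return repository.save(item);
    }
}
```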

The default connection pool size is 10, and I haven't tweaked any other parameters, either for HikariCP or for Postgres.
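
For reference, those defaults spelled out as Spring Boot properties would look like this (I have not actually overridden anything; the values shown are the HikariCP defaults):

```properties
# HikariCP defaults made explicit - not actually set in my application.properties
spring.datasource.hikari.maximum-pool-size=10
spring.datasource.hikari.connection-timeout=30000
```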

Where I am stuck at the moment is debugging the connections in the 'active' state: what they are doing, and why they are stuck.

When I run 10 simultaneous users, this translates into roughly 2 or 3 times that many queries, and when I turn on the HikariCP debug logs the pool hangs at something like (total=10, active=10, idle=0, waiting=2). The 'active' connections never get released, which is what I am trying to understand, because the queries are fairly simple and the table itself has just 4 fields (including the primary key).

The general best practice, from the HikariCP folks and elsewhere, is that increasing the connection pool size is not the right first step towards scaling.

If I do increase the connection pool size to 20, things start working for 10 concurrent users, but I believe that does not address the root cause of the problem.

Is there any way I can log Hibernate or Postgres messages that would help me see what these 'active' connections are waiting on, and why a connection doesn't get released even after I increase the wait time to a very large value?
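
For example, the application side can be made more verbose with the standard Spring Boot logging properties below, and Postgres can log lock waits via log_lock_waits in postgresql.conf, but I am not sure these are the right knobs:

```properties
# Application-side logging I can turn on (standard Spring Boot logging levels)
logging.level.com.zaxxer.hikari=DEBUG
logging.level.org.hibernate.SQL=DEBUG
spring.jpa.properties.hibernate.generate_statistics=true
```

On the Postgres side, setting log_lock_waits = on makes the server log a message whenever a session waits longer than deadlock_timeout (1s by default) for a lock.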

If it is a connection leak (as is reported when the leak-detection threshold is lowered, e.g. to 30 seconds), how can I tell whether Hibernate is responsible for the leak or whether it is something else?
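
For reference, this is the leak-detection setting I am referring to (30 seconds here); when it fires, HikariCP logs a warning with the stack trace of the code path that borrowed the connection:

```properties
spring.datasource.hikari.leak-detection-threshold=30000
```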

If it is a lock/wait at the database level, how can I get to the root of it?

UPDATE: After help from @brettw, I took a thread dump when the connections were exhausted, and it pointed in the direction of a connection leak. This thread on the HikariCP issues board - https://github.com/brettwooldridge/HikariCP/issues/1030#issuecomment-347632771 - points to Hibernate not closing connections, which in turn led me to https://jira.spring.io/browse/SPR-14548, about setting Hibernate's connection release mode, since the default mode holds on to the connection for too long. After setting spring.jpa.properties.hibernate.connection.handling_mode=DELAYED_ACQUISITION_AND_RELEASE_AFTER_TRANSACTION, the connection pool worked perfectly.
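
For anyone else hitting this, the fix is this single line in application.properties:

```properties
spring.jpa.properties.hibernate.connection.handling_mode=DELAYED_ACQUISITION_AND_RELEASE_AFTER_TRANSACTION
```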

Also, the point made here - https://github.com/brettwooldridge/HikariCP/issues/612#issuecomment-209839908 - is right: a connection leak should not be covered up by the pool.

FrailWords
  • If your code creates a connection and doesn't close it, you won't be able to find it except by searching your code or using tools such as sonarlint to help you find it – Ori Marko May 30 '18 at 14:37
  • @user7294900 I also suspect that connections not being closed is the problem somewhere. In this case, connection management is done by Hibernate, so I am not doing anything explicitly in the code to manage connections. I'll check out sonarlint, I hadn't heard of it before. – FrailWords May 30 '18 at 14:40

1 Answer


It sounds like you could be hitting a true deadlock in the database. There should be a way to query PostgreSQL for current active queries, and current lock states. You'll have to google it.
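
A sketch of such queries, using the standard pg_stat_activity and pg_locks catalog views:

```sql
-- What each backend is doing right now (the wait_event columns exist on PostgreSQL 9.6+)
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state <> 'idle';

-- Lock requests that have not been granted, with the waiting session's current query
SELECT l.pid, l.locktype, l.mode, l.granted, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted;
```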

Also, I would try a simple thread dump to see where all the threads are blocked. It could be a code-level synchronization deadlock.
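
A dump can be taken with the JDK's jstack against the running JVM, for example:

```bash
# find the JVM's process id with jps, then write every thread's stack to a file
jstack <pid> > threads.txt
```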

  • If all of the threads are blocked on getConnection(), it is a leak.
  • If all of the threads are down in the driver, according to the stacktrace for each thread, it is a database deadlock.
  • If all of the threads are blocked waiting for a lock in your application code, then you have a synchronization deadlock -- likely two locks with inverted acquisition order in different parts of the code.

The HikariCP leakDetectionThreshold could be useful, but it will only show where the connection was acquired, not where the thread is currently stuck. Still, it could provide a clue.

brettw
  • I did a thread dump when the connections got exhausted, and all the threads are stuck in `getConnection()`, at `com.zaxxer.hikari.util.ConcurrentBag.borrow`. I read your comment about Hibernate not closing the connection in this thread - https://github.com/brettwooldridge/HikariCP/issues/1030#issuecomment-347632771 - and once I set the option according to https://jira.spring.io/browse/SPR-14548, it started releasing connections and the pool no longer gets exhausted. Now the problem is more at the API response level, and that is easier to reason about. – FrailWords Jun 04 '18 at 09:52