9

I am running a web application in Production that recently crashed after it was under some stress. I would guess 100-300 people were accessing the site at similar times, which I would expect to work fine.

The logs around the time of the crash are:

org.codehaus.groovy.grails.web.taglib.exceptions.GrailsTagException: Error executing tag <g:render>: Hibernate operation: could not inspect JDBC autocommit mode; uncategorize\
d SQLException for SQL [???]; SQL state [null]; error code [0]; Cannot get a connection, pool error Timeout waiting for idle object; nested exception is org.apache.commons.db\
cp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object at /WEB-INF/grails-app/views/layouts/file.gsp:37
        at gsp_file_gsp$_run_closure2.doCall(file_gsp.groovy:43)
        at gsp_file_gsp$_run_closure2.doCall(file_gsp.groovy)
        at gsp_file_gsp.run(gsp_file_gsp.groovy:48)
        at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:190)
        at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:291)
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:774)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:896)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.springframework.jdbc.UncategorizedSQLException: Hibernate operation: could not inspect JDBC autocommit mode; uncategorized SQLException for SQL [???]; SQL stat\
e [null]; error code [0]; Cannot get a connection, pool error Timeout waiting for idle object; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot get a co\
nnection, pool error Timeout waiting for idle object
        at User.find(User.groovy:68)
        at User$find.call(Unknown Source)
        at gsp_pps_file_gsp.run(gsp_file_gsp.groovy:22)
        ... 9 more
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
        at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
        at $Proxy7.getAutoCommit(Unknown Source)
        ... 12 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
        at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1144)
        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
        ... 14 more

I believe this is directly related to some code I added recently to my Grails project that affect the Connection Pool (although I believe this is not a grails specific problem):

maxActive = 50
maxIdle = 15
minIdle = 5
initialSize = 15
minEvictableIdleTimeMillis = 180000
timeBetweenEvictionRunsMillis = 180000
maxWait = 10000
validationQuery = "/* ping */"

What am I doing wrong? Please help! Thank you.

skaz
  • 21,962
  • 20
  • 69
  • 98
  • Is my maxWait too low? I am reading the documentation and this could have been the problem, but I can't say I am confident in it. – skaz Sep 01 '11 at 23:30
  • 1
    Also, if I do `show processlist;` in MySQL I only get 8 connections from my Apache server. Does that mean that this code isn't working at all? – skaz Sep 02 '11 at 00:13
  • If I remember right your initial problem was that you had a bunch of dead connections in your pool because MySQL timed out. Did you try just setting your validation string and leaving everything else as it was? – chrislatimer Sep 02 '11 at 15:37
  • Did you get a fix for this issue? Can you please share the resolution? – Java Guy May 04 '12 at 01:03
  • 1
    Check if you have any connection leaks. Review your code http://stackoverflow.com/questions/10440391/dbcp-connection-properties/10471288#10471288 – Java Guy May 07 '12 at 22:36
  • Do you happen to have the tuned server max thread count versus the database max count after it was working? – ledlogic Jan 17 '15 at 17:51

2 Answers2

3

Change out the relevant parts to:

minEvictableIdleTimeMillis=1800000
timeBetweenEvictionRunsMillis=1800000
numTestsPerEvictionRun=3
testOnBorrow=true
testWhileIdle=true
testOnReturn=true
validationQuery="SELECT 1"

This should probably solve this issue.

Oliver Tynes
  • 954
  • 4
  • 11
-4

I had bad code that was keeping connections alive way longer than they should be. Once I fixed the underlying problem this went away.

skaz
  • 21,962
  • 20
  • 69
  • 98
  • 3
    How did you track down the offending code? I am seeing the same problem that only exists in production. We recently added a new feature and when I turn that off the issue disappears. That new code doesn't directly manage the DB connections though so I'm not sure how it could be leaving connections open. The only thing I can say that is unique is the new code is executed in a DB trigger method....afterInsert() in a few of my Domain classes. – ShatyUT Feb 10 '13 at 03:53
  • 1
    @ShatyUT you could always track your connections using a profiler like VisualVM (with proper plugins) - so you could see in runtime your connections being opened and fail to close on specific actions. GL – Yaniv Apr 01 '14 at 08:03