22

I've got a Spring app that I believe uses DBCP connection pooling to connect to a MySql database. I say believe because this isn't an area I'm very strong in and I'm not positive if everything is set up correctly. I have no problems running the application and everything is working fine. The problem occurs overnight. The app is not heavily used and overnight it apparently loses it's connection to MySql. I looked into it and found out MySql has an 8 hour window and then it disconnects or whatever. I'm fine with this, but when a user attempts to log on in the morning, they get an error something like:

Communications link failure. The last packet successfully received 60,000,000ms ago. The last packet successfully setn 15ms ago.

This is the problem. I need them to be able to reconnect in the morning without running into this issue. The only way I seem to be able to fix it is by bouncing the Tomcat server. From looking into it, it seems that DBCP pooling should be able to prevent this somehow but I can't find a reliable source of info on how to configure it. I'm hoping someone here can provide me with some insight. Here is my current configuration, all done in a Spring xml file:

app-data.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:aop="http://www.springframework.org/schema/aop"
xmlns:tx="http://www.springframework.org/schema/tx"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="
    http://www.springframework.org/schema/beans 
    http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
    http://www.springframework.org/schema/tx 
    http://www.springframework.org/schema/tx/spring-tx-3.0.xsd
    http://www.springframework.org/schema/aop 
    http://www.springframework.org/schema/aop/spring-aop-3.0.xsd
    http://www.springframework.org/schema/context
    http://www.springframework.org/schema/context/spring-context-3.0.xsd">

<context:annotation-config />
<context:component-scan base-package="com.vz.sts.domain" />
<context:component-scan base-package="com.vz.sts.persistence" />
<context:component-scan base-package="com.vz.sts.service" />

<bean class="org.springframework.orm.jpa.support.PersistenceAnnotationBeanPostProcessor" />

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="dataSource" ref="dataSource" />
    <property name="jpaVendorAdapter">
        <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter">
            <property name="database" value="MYSQL" />
            <property name="showSql" value="true" />
        </bean>
    </property>
</bean>

<bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource">
    <property name="driverClassName" value="com.mysql.jdbc.Driver" />
    <property name="url" value="jdbc:mysql://localhost:3306/app" />
    <property name="username" value="root" />
    <property name="password" value="admin" />
    <property name="initialSize" value="5" />
</bean>

<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>

<bean id="jdbcUserService" class="org.springframework.security.provisioning.JdbcUserDetailsManager">
    <property name="dataSource" ref="dataSource"/>
    <property name="authenticationManager" ref="authenticationManager"/>
</bean>

<bean id="saltSource" class="org.springframework.security.authentication.dao.ReflectionSaltSource">
    <property name="userPropertyToUse" value="username" />
</bean>

<tx:annotation-driven />
</beans>

I'm not sure what specific properties I need to add in order to allow the app to reconnect to the database. I don't mind if it closes the connection after a number of hours but it should automatically reconnect and not throw errors like this. Nor am I even positive it's actually set up to use connection pooling. So any help would be very much appreciated, thank you.

UPDATE

I found this page and I think that all I need to do is add the ValidationQuery property. Can anyone verify if this will have the desire affect while leaving everything else at default? I believe that will then make use of the testOnBorrow aspect of DBCP. I don't entirely understand what the explanation says testOnBorrow does, but I think this will do what I want. Anyone confirm? Thanks.

cardician
  • 2,451
  • 3
  • 26
  • 36
  • 1
    One thought would be to have a simple script that hits the service to keep it "warm" every hour or so. If the issue is that resident services are being taken off-line due to inactivity, this would keep that from happening. – cdeszaq Dec 16 '11 at 14:24
  • Thanks but I know DBCP can reconnect without this problem or that there is a setting to prevent this and that's the way I'd prefer to solve it. – cardician Dec 16 '11 at 14:28
  • I agree, which is why I posted that as a comment, not an answer. My suggestion is nothing more than a work-around, but it does eliminate the symptoms for your users until you find a solution to the problem :) – cdeszaq Dec 16 '11 at 14:30

2 Answers2

4

The short answer is it should be enough. DBCP supports testing the connection on borrowing from the connection pool (the default), but also supports test on return and test while idle.

It's also worth understanding what may be going wrong here. It sounds like something between your Tomcat server and the database is dropping the idle connection after a timeout (such as a router or firewall). The problem with this is that Tomcat thinks it still has a valid connection, tries to do some work with the connection and fails, but keeps the connection alive and returns it to the pool. Now any further attempt to talk to the database will fail if it is given the same broken connection from the pool.

I think it was Michael Nygard's excellent 'Release It!' book that described this scenario in one of his from-the-trenches stories.

You will also want to look into how MySQL cleans up dead connections as when Tomcat loses the connection after 8 hours the DB will also be unaware of the failed connection.

One final point, if you are using Tomcat 7 switch to their new connection pool as it offers better performance than DBCP.

SteveD
  • 5,396
  • 24
  • 33
  • Are you saying that I should use one of the other tests? Return or Idle? I know they are defaulted to false so I'll need to turn them on. I'm game to using whatever is the best solution, but as I said I don't fully understand what the explanation on the DBCP page are saying those three methods do. – cardician Dec 16 '11 at 14:57
  • 2
    The mysql server setting `wait_timeout` is by default 8 hours. Idle connections are dropped after 8 hours, a JDBC driver doesn't detect that until you issue a query on that connection, and you get the above stacktrace. (you easily get 8 hours of idle time for apps that's only used during working hours). Often setting the DBCP testWhileIdle, validationQuery and timeBetweenEvictionRunsMillis can help with such issues, as it'll ping the connection and keep it alive. The validation query can just be set to "select 1" – nos Dec 16 '11 at 15:14
  • Ok, thanks very much. And thanks for the Select 1 statement. Much better than what I was using. I will see if these fixes work on the next work day. Very much appreciated. – cardician Dec 16 '11 at 15:22
  • 2
    After an entire weekend away and adding testWhileIdle and timeBetweenEvictionRunsMillis I can verify that this did indeed solve the problem. Thanks very much for your answer and help. – cardician Dec 19 '11 at 15:03
  • @nos I am using Hikari and we do not have testWhileIdle options there?? What do you suggest?? I am using SELECT 1 as connection-test-query.. – sheetal Feb 15 '17 at 16:00
1

My friend, DBCP does a promise he can't keep. Hehe. I've found myself with this problem and it got down to some newly firewall recently put in the middle chopping idle connections with idle time longer than X hours. So, the Db couldn't notify my client (and its socket) that the conn was going down and the socket was kept open, hence the pool couldn't know that the conn was not available. Result: first query attempt in the morning failed with timeout while the second worked as expected. Even with the validationQuery, DBCP didn't check an already valid conn (don't ask me why, I just found out that)

Solution 1? Due to the fact that it was a production environment (yeah, lots of sweat), the fast horse was to create a separate thread sending a sure-thing query to the DB using the pool every... X/4 hours. It kept the brand-new firewall/WAF from chopping my socket conn!

Solution 2? Check infrastructure. Check continuity. Check coherence in speed and mode of network interfaces (e.g full duplex, 100M). Check Db server settings (no net card saving energy hehe). And maybe keeping the probe in solution 1 working.

EDIT. testOnBorrow and validationQuery should work under normal circumstances. Imaging the pool with logical channels and a physical socket btw client and server. testOnBorrow checks if a channel is valid before giving it out to your request. It uses validationQuery to do it.

Alfabravo
  • 7,493
  • 6
  • 46
  • 82
  • Thanks. Since this doesn't occur until after an 8 hour period of inactivity I won't be able to know for sure if ading the validationQuery solved my problem or if there is more to it. I appreciate the info though. – cardician Dec 16 '11 at 14:56
  • Maybe the image I described was not clear enough to emphasize the socket-conn thing. testOnBorrow CAN validate a conn BUT it fails checking a SOCKET. Hence, the object created using the driver returns some odd exception that is not managed properly by the pool, so your validationQuery become useless... and I know how it is. I had to get to the client's for a week, early in the morning to check if the problem was gone hehehe. Anyhow, let us know how it goes! – Alfabravo Dec 16 '11 at 14:58
  • Sorry, none of this is something I'm overly strong at. So you're saying that my error will not be fixed by using ValidationQuery because the problem is actually something else killing the connection and not letting it be re-established? Is there some way to have DBCP force the connection to remain open then? Maybe using timeBetweenEvictionRunsMillis which defaults to off I believe? The only difference I see in our situations is that no queries work once this happens for me. First, second, none. – cardician Dec 16 '11 at 15:01
  • I say that, if the error is related to deep-physical problems (not session level, app level BUT network, link or transportation level), DBCP can't figure it out because tools in drivers don't give enough info (I even decompiled a driver hehehehe). If the conn error is logical, then setting the extra parameters will work like a charm. Let's wait :) --- EDIT: timeBtwEviction relates to interval between events of checking which conn is idle in the pool and, if it's old enough, kill it. – Alfabravo Dec 16 '11 at 15:03