TLDR:

Lots of TCP connections in CLOSE_WAIT status are shutting down the server

Setup:

  • riak_1.2.0-1_amd64.deb installed on Ubuntu12
  • Spring MVC 3.2.5
  • riak-client-1.1.0.jar
  • Tomcat7.0.51 hosted on Windows Server 2008 R2
  • JRE6_45

Full Description:

How do I ensure that the Java RiakClient properly cleans up its connections so that I'm not left with an abundance of CLOSE_WAIT TCP connections?

I have a Spring MVC application which uses the Riak java client to connect to the remote instance/cluster.

We are seeing a lot of TCP Connections on the server hosting the Spring MVC application, which continue to build up until the server can no longer connect to anything because there are no ports available.

Restarting the Riak cluster does not clean the connections up.

Restarting the webapp does clean up the extra connections.

We are using the HTTPClientAdapter and REST api.

When connecting to a relational database, I would normally clean up connections by either explicitly calling close on the connection, or by registering the datasource with a pool and transaction manager and then annotating my services with @Transactional.

But since we are using the HTTPClientAdapter, I expected it to behave more like an HttpClient. With an HttpClient, I would consume the response entity with EntityUtils.consume(...) to ensure that everything is properly cleaned up.

HTTPClientAdapter does have a shutdown method, and I see it being called in the online examples. When I traced the method call through to the actual RiakClient, the method is empty. Also, when I dig through the source code, nowhere in it does it ever close the Stream on the HttpResponse or consume any response entity (as with the standard Apache EntityUtils example).
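To make the kind of cleanup I was expecting concrete, here is a minimal sketch in plain java.io of "read the response to the end, then close it". It is only an illustration of the pattern, not code from the Riak client or from Apache HttpClient; the class name StreamCleanup and the method drainAndClose are hypothetical names I made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of riak-java-client or Apache HttpClient.
final class StreamCleanup {
    private StreamCleanup() {}

    /**
     * Reads the stream until EOF, discarding the bytes, and always closes it.
     * Returns the number of bytes that were discarded.
     */
    static long drainAndClose(InputStream in) throws IOException {
        if (in == null) {
            return 0;
        }
        long discarded = 0;
        byte[] buffer = new byte[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                discarded += n;
            }
        } finally {
            // Closing in finally releases the stream even if a read fails.
            in.close();
        }
        return discarded;
    }
}
```

The idea would be to call something like this on the response body stream after every request, so the underlying connection can be reused or closed instead of being left half-open.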

Here is an example of how the calls are being made.

   private RawClient getRiakClientFromUrl(String riakUrl) {
      return new HTTPClientAdapter(riakUrl);
   }


  public IRiakObject fetchRiakObject(String bucket, String key, boolean useCache) {

      try {
         MethodTimer timer = MethodTimer.start("Fetch Riak Object Operation");
         //logger.debug("Fetching Riak Object {}/{}", bucket, key);
         RiakResponse riakResponse;
         riakResponse = riak.fetch(bucket, key);
         if(!riakResponse.hasValue()) {
            //logger.debug("Object {}/{} not found in riak data store", bucket, key);
            return null;
         }

         IRiakObject[] riakObjects = riakResponse.getRiakObjects();
         if(riakObjects.length > 1) {
            String error = "Got multiple riak objects for " + bucket + "/" + key;
            logger.error(error);
            throw new RuntimeException(error);
         }

         //logger.debug("{}", timer);
         return riakObjects[0];
      }
      catch(Exception e) {
         logger.error("Error fetching " + bucket + "/" + key, e);
         throw new RuntimeException(e);
      }
   }

The only option I can think of is to create the RiakClient separately from the adapter, so that I can access the HttpClient and then the ConnectionManager.

I am currently working on switching over to the PBClientAdapter to see if that might help, but for the purposes of this question (and because the rest of the team may not like me switching for whatever reason), let's assume that I must continue to connect over HTTP.

  • Whatever `RiakClient` is, it must have a close or disconnect or release or something operation, that you aren't calling. – user207421 May 08 '15 at 00:54
  • Yes, it does. "HTTPClientAdapter does have a shutdown method, and I see it being called in the online examples. When I traced the method call through to the actual RiakClient, the method is empty." Link to the riak-java-client project on GitHub: https://github.com/basho/riak-java-client – Pytry May 08 '15 at 14:09

1 Answer

It's been almost a year, so I thought I would go ahead and post how I solved this problem.

The solution was to build the HTTPClientAdapter provided by the Java client from an explicitly configured RiakClient, passing in configuration for pooling and max connections. Here's a code example of how to do it.

First, we are on an older version of Riak, so here's the Maven dependency:

<dependency>
    <groupId>com.basho.riak</groupId>
    <artifactId>riak-client</artifactId>
    <version>1.1.4</version>
</dependency>

And here's the example:

public RawClient riakClient() {

    RiakConfig config = new RiakConfig(riakUrl);
    //httpConnectionsTimetolive is in seconds, but timeout is in milliseconds
    config.setTimeout(30000);
    config.setUrl("http://myriakurl/");
    config.setMaxConnections(100); //Or whatever value you need

    RiakClient client = new RiakClient(config);

    return new HTTPClientAdapter(client);
}

I actually broke that up a bit in my implementation and used Spring to inject values; I just wanted to show a simplified example for it.

By setting the timeout to something less than the standard five minutes, the system will not hang on to the connections for too long (so, 5 minutes + whatever you set the timeout to), which causes the connections to enter the CLOSE_WAIT status sooner.

And of course, setting the max connections in the pool prevents the application from opening tens of thousands of connections.
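To illustrate why a cap helps, here is a minimal sketch of the general idea using only java.util.concurrent. This is not how the Riak client or HttpClient implements its pool; the class BoundedPermits and its method names are hypothetical and exist only for this example. The point is that once all slots are taken, new callers wait (or give up after a timeout) instead of opening yet another socket.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a connection cap; not the Riak client's pool.
final class BoundedPermits {
    private final Semaphore permits;

    BoundedPermits(int maxConnections) {
        // Fair semaphore so waiting callers are served in arrival order.
        this.permits = new Semaphore(maxConnections, true);
    }

    /** Tries to reserve a connection slot, waiting at most timeoutMillis. */
    boolean tryAcquire(long timeoutMillis) throws InterruptedException {
        return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /** Returns a slot; call exactly once per successful tryAcquire, in a finally block. */
    void release() {
        permits.release();
    }

    /** Number of slots currently free. */
    int available() {
        return permits.availablePermits();
    }
}
```

With a cap of 100, the 101st concurrent request would block for up to the timeout and then fail fast, rather than consuming another port on the server.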
