  • We previously stored user sessions in a database table (Postgres on RDS).

  • We decided to migrate user sessions from the database to Redis and changed our application accordingly.

  • For Redis, we chose the ElastiCache service with 1 shard, 2 nodes (primary + replica), and Multi-AZ enabled.

  • In the live environment, things ran smoothly until the number of sessions crossed 0.5 million (around 3 PM).

  • At that point, the Redis node suddenly stopped responding, which brought down our production environment completely (too many threads waiting for a connection).

  • We had to reboot our instance to restore service.

  • It happened again later that evening, around 9 PM.

The exception generated on the Java side (Spring):

2016/11/22 09:19:31.749 [http-nio-8080-exec-780] [ERROR] org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet] - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception

org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool

at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:140) ~[spring-data-redis-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]

at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.getConnection(JedisConnectionFactory.java:229) ~[spring-data-redis-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]

....

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_72]

at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-8.0.20.jar!/:8.0.20]

at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]

Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool

at redis.clients.util.Pool.getResource(Pool.java:42) ~[jedis-2.5.2.jar!/:na]

at redis.clients.jedis.JedisPool.getResource(JedisPool.java:84) ~[jedis-2.5.2.jar!/:na]

at redis.clients.jedis.JedisPool.getResource(JedisPool.java:10) ~[jedis-2.5.2.jar!/:na]

at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:133) ~[spring-data-redis-1.4.2.RELEASE.jar!/:1.4.2.RELEASE]

... 55 common frames omitted

Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: connect timed out

at redis.clients.jedis.Connection.connect(Connection.java:150) ~[jedis-2.5.2.jar!/:na]

at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:71) ~[jedis-2.5.2.jar!/:na]

at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1783) ~[jedis-2.5.2.jar!/:na]

at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:65) ~[jedis-2.5.2.jar!/:na]

at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:836) ~[commons-pool2-2.2.jar!/:2.2]

at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:434) ~[commons-pool2-2.2.jar!/:2.2]

at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:361) ~[commons-pool2-2.2.jar!/:2.2]

at redis.clients.util.Pool.getResource(Pool.java:40) ~[jedis-2.5.2.jar!/:na]

... 58 common frames omitted

Caused by: java.net.SocketTimeoutException: connect timed out

at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.7.0_72]

at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) ~[na:1.7.0_72]

at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) ~[na:1.7.0_72]

at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) ~[na:1.7.0_72]

at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.7.0_72]

at java.net.Socket.connect(Socket.java:579) ~[na:1.7.0_72]

at redis.clients.jedis.Connection.connect(Connection.java:144) ~[jedis-2.5.2.jar!/:na]

... 65 common frames omitted

We still don't know the root cause.

Can someone point us in the right direction and help us identify the root cause and a solution?

1 Answer

On some versions of Spring Data Redis, the Redis connection is not released after a transaction completes, so the connection pool is eventually exhausted. If you initialize your RedisTemplate with setEnableTransactionSupport(true), you can trigger this bug. Setting it to false should fix it.

There are other workarounds if you need transactions. See the section "A transaction pitfall in Spring Data Redis" in this article: http://www.javaworld.com/article/3062899/big-data/lightning-fast-nosql-with-spring-data-redis.html
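
The exhaustion mechanism described in this answer can be modeled without Redis at all. Here is a minimal sketch using only the JDK. ToyPool and PoolLeakSketch are hypothetical names made up for illustration and are not part of Jedis or Spring Data Redis, and the pool size of 8 and the 1 ms borrow timeout are arbitrary assumptions.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a bounded connection pool (hypothetical, not the Jedis API). */
class ToyPool {
    private final Semaphore permits;

    ToyPool(int maxTotal) {
        this.permits = new Semaphore(maxTotal);
    }

    /** Borrows a "connection"; returns false if none frees up within maxWaitMillis. */
    boolean borrow(long maxWaitMillis) {
        try {
            return permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a previously borrowed "connection" to the pool. */
    void giveBack() {
        permits.release();
    }
}

class PoolLeakSketch {
    /** Makes `requests` borrow attempts and returns how many of them succeeded. */
    static int serve(ToyPool pool, int requests, boolean returnConnection) {
        int served = 0;
        for (int i = 0; i < requests; i++) {
            if (!pool.borrow(1)) {
                continue; // pool exhausted: this request fails
            }
            served++;
            if (returnConnection) {
                pool.giveBack(); // well-behaved caller releases the connection
            }
            // a leaky caller never releases, so the permit is lost for good
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + serve(new ToyPool(8), 100, false));
        System.out.println("well-behaved: " + serve(new ToyPool(8), 100, true));
    }
}
```

Run as-is, the leaky variant serves only 8 of 100 requests, while the variant that returns each connection serves all 100. In a real application, each failed borrow would likely surface as "Could not get a resource from the pool", with request threads piling up while they wait.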

Ezward