I have developed a plugin for Tomcat that allows session data to be persisted and distributed across a Cassandra ring. I want Cassandra to handle session expiry using TTL settings on the various columns. The problem I have now is that the various objects within a session all expire at different times, so a session loses unused objects over time - even when the session (and other session objects) is continuously accessed.

Is there a way in which I can set the TTL on a super column, and have all data stored under a key in this super column expire when the key expires?

I don't want to traverse all data stored within the web session each time an HTTP response is returned, as this would incur unnecessary I/O between the Tomcat plugin and Cassandra. I also don't want to keep any in-memory cache in the Tomcat plugin, as I want Tomcat to be completely stateless and maintain all user session state purely in Cassandra.

This Tomcat plugin is pretty nifty, as it allows previously stateful web applications to become stateless - thus allowing horizontal scaling. It would be fantastic to get over this TTL issue...

https://code.google.com/a/apache-extras.org/p/tomcat-cassandra/

1 Answer

I have recently been thinking about this problem, and there is no easy solution using the built-in column TTL that avoids significant I/O on each request. I would suggest not relying entirely on the column TTL.

  1. When creating the session, add an extra expiration_timestamp column and set it to a reasonable value, say a timestamp 60 minutes in the future.

session_key, value1, value2, expiration_timestamp=60minutes

  2. When inserting a column (value1, value2, etc.), set its TTL to 60 minutes. Columns may be added dynamically during an HTTP request, so initially they expire at different times.

  3. When the user performs an HTTP request, read the session row (optimized through the row cache, key cache, etc.), but don't update the TTL or expiration_timestamp yet.

  4. When a request arrives close to the expiration_timestamp, say at 50 minutes, update the expiration_timestamp (add another 60 minutes) and reinsert the entire session row with a new TTL. All column TTLs are then updated at the same time and stay in sync.

This ensures that most requests perform only reads (hopefully served from cache), and that a write (to keep the session alive and refresh the TTLs) happens only when the session is about to expire. The exact point at which you refresh the TTL is arbitrary: you can set it to 30 minutes or 50 minutes, just not on every access.
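The steps above could be sketched roughly as follows in Java. This is only an illustration of the refresh decision, not code from the tomcat-cassandra plugin: `SessionTtlPolicy`, `SessionStore`, `writeRow` and the two constants are hypothetical names, and the store interface is a stand-in for whatever Cassandra client call rewrites the row with a TTL. It also assumes the new expiration_timestamp is computed as "now + 60 minutes" so that it matches the TTL applied to the rewritten columns.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the "refresh only when close to expiry" policy. */
final class SessionTtlPolicy {
    // Assumed values taken from the example in the answer (seconds).
    static final long SESSION_TTL_SECONDS = 60 * 60;   // TTL applied to every column
    static final long REFRESH_AFTER_SECONDS = 50 * 60; // rewrite the row once it is this old

    /** Stand-in for the Cassandra write path; not a real driver API. */
    interface SessionStore {
        void writeRow(String sessionKey, Map<String, Object> columns, long ttlSeconds);
    }

    /** True when the remaining lifetime has dropped to the refresh window. */
    static boolean needsRefresh(long expirationTimestamp, long nowSeconds) {
        long remaining = expirationTimestamp - nowSeconds;
        return remaining <= SESSION_TTL_SECONDS - REFRESH_AFTER_SECONDS;
    }

    /**
     * Called on each request after the session row has been read.
     * Usually does nothing; near expiry it reinserts every column with a
     * fresh TTL so that all columns expire together again.
     * Returns the expiration timestamp now in effect.
     */
    static long touch(String sessionKey, Map<String, Object> columns,
                      long nowSeconds, SessionStore store) {
        long expiration = (Long) columns.get("expiration_timestamp");
        if (!needsRefresh(expiration, nowSeconds)) {
            return expiration; // read-only request, no write I/O
        }
        long newExpiration = nowSeconds + SESSION_TTL_SECONDS;
        Map<String, Object> refreshed = new HashMap<>(columns);
        refreshed.put("expiration_timestamp", newExpiration);
        store.writeRow(sessionKey, refreshed, SESSION_TTL_SECONDS);
        return newExpiration;
    }
}
```

With these assumed values, a request 10 minutes into the session triggers no write, while a request at 50 minutes rewrites the whole row once and pushes the expiry out by another hour.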

user2659443