
I have a cluster that uses Infinispan in embedded, replicated mode. The cluster has just two nodes, which work in active-standby mode.

I need to support rolling upgrades of my application without losing any cached data in the process.

To illustrate, consider a simple class User in version v1 of my application:

public class User {
    private String role;
    private String firstName;
}

In version v2 I add an attribute to the User class:

public class User {
    private String role;
    private String firstName;
    private String lastName;
}

My application gets upgraded as below:

  1. The cluster is created with nodes n1 (active) and n2 (standby), both running version v1 of my application. 10 User objects are created and form part of the key or value of an Infinispan cache c1.
  2. Upgrade initiated on active node n1.
  3. n1 completes the pre-checks and invokes an upgrade on standby node n2.
  4. n2 disconnects itself from the cluster, reboots with the new version v2.
  5. n2 connects back to the cluster, which in turn triggers the upgrade of node n1.
  6. n1 completes the upgrade, comes up with version v2 and joins the cluster.
  7. Upgrade complete.

At step-5, when n2 connects back to the cluster, the User class in version v1 on node n1 is different from User class in version v2 on node n2.

I cannot upgrade n1 before it has synced with n2, as that would lose data; nor can I expect n2 to fetch all the data and populate its caches, because the model of c1's key/value has changed.

So how do I handle such a scenario? The User class might also undergo other changes, such as an attribute being deleted or changing its data type, or the class itself being removed.

Are there any best practices or well-known methods for handling rolling upgrades with Infinispan caches?

I also need to handle cache updates on node n1 during the upgrade and make sure that n2's caches (already on version v2) receive these updates.

Note: I use Infinispan 5.3.0 in embedded mode.

2 Answers


Infinispan has a rolling upgrade feature for upgrading the Infinispan version itself: a new cluster is spawned and the application switches to it. When data is requested, the new cluster retrieves it from the old cluster, but all writes go to the new cluster only, so eventually all data moves to the new one (I think there's an option to push it eagerly, too).

However, this does not apply to data upgrades: I haven't found any way to inject your own code into the upgrade. The recommended practice is to make the User class Externalizable and switch to a new marshaller on the joining node, one that can handle both formats. However, I am not sure whether this is doable for data already in the cache that doesn't support externalization.
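To make the "marshaller that handles both formats" idea concrete, here is a minimal sketch using plain java.io.Externalizable with a format tag written in front of the fields. The tag values, the UserCodec helper and the choice of null as the default lastName are assumptions of this sketch, not something Infinispan prescribes; Infinispan's own externalizer interfaces follow a similar read/write pattern, but they are not used here so that the example stays self-contained. It also only covers the v2 side reading v1 data: a v1 node would not understand format 2 unless it was written with a tag from the start.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// v2 of the User class from the question, with a format tag (assumption of this sketch).
class User implements Externalizable {
    static final byte FORMAT_V1 = 1; // role, firstName
    static final byte FORMAT_V2 = 2; // role, firstName, lastName

    private String role;
    private String firstName;
    private String lastName;

    public User() { } // public no-arg constructor required by Externalizable

    User(String role, String firstName, String lastName) {
        this.role = role;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeByte(FORMAT_V2); // always write the newest format
        out.writeObject(role);
        out.writeObject(firstName);
        out.writeObject(lastName);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        byte format = in.readByte();
        if (format != FORMAT_V1 && format != FORMAT_V2) {
            throw new IOException("Unknown User format: " + format);
        }
        role = (String) in.readObject();
        firstName = (String) in.readObject();
        // Entries written in format 1 have no lastName; fall back to null.
        lastName = (format == FORMAT_V2) ? (String) in.readObject() : null;
    }

    String getRole() { return role; }
    String getFirstName() { return firstName; }
    String getLastName() { return lastName; }
}

// Hypothetical helper that round-trips users through byte arrays.
final class UserCodec {
    // Produces the bytes a tagged v1 writer would have produced.
    static byte[] encodeV1(String role, String firstName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeByte(User.FORMAT_V1);
            out.writeObject(role);
            out.writeObject(firstName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] encode(User user) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            user.writeExternal(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static User decode(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            User user = new User();
            user.readExternal(in);
            return user;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this, UserCodec.decode(UserCodec.encodeV1("admin", "Ada")) yields a User whose lastName is null, while data written by v2 round-trips with all three fields.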

Another option is quite complicated: spawn a new cluster (single node) and, in your application, use a wrapper for the cache operations that would, upon:

  • read: get the value from the new cluster; if it's null, get it from the old one, convert it, write it to the new cluster and remove it from the old cluster
  • write: write the value to the new cluster and remove it from the old one
  • delete: remove from both clusters

Then I'd spawn a thread that reads data from the old cluster and calls putIfAbsent on the new one. This has a small race: you could putIfAbsent data into the new cluster just after it was removed there. Solving this requires keeping tombstones in the new cluster until all the data has been migrated. As I've said, it would be quite complicated :)
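A rough sketch of such a wrapper, assuming both caches can be treated as ConcurrentMap instances (plain in-memory maps work for trying it out). MigratingCache, the converter function and the use of Optional.empty() as the tombstone are all hypothetical names and choices made for this illustration; a real replicated cache would need a marshallable marker type instead of Optional. It uses Java 8 APIs.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical wrapper: reads fall back to the old cache, writes go to the new one.
class MigratingCache<K, OldV, NewV> {
    private final ConcurrentMap<K, OldV> oldCache;
    // Optional.empty() is the tombstone: "deleted, do not copy from the old cache".
    private final ConcurrentMap<K, Optional<NewV>> newCache;
    private final Function<OldV, NewV> converter;

    MigratingCache(ConcurrentMap<K, OldV> oldCache,
                   ConcurrentMap<K, Optional<NewV>> newCache,
                   Function<OldV, NewV> converter) {
        this.oldCache = oldCache;
        this.newCache = newCache;
        this.converter = converter;
    }

    // read: new cache first, then old cache with conversion.
    NewV get(K key) {
        Optional<NewV> current = newCache.get(key);
        if (current != null) {
            return current.orElse(null); // a tombstone reads as "absent"
        }
        OldV old = oldCache.get(key);
        if (old == null) {
            return null;
        }
        NewV converted = converter.apply(old);
        // putIfAbsent loses against a concurrent write or tombstone, which is what we want.
        Optional<NewV> raced = newCache.putIfAbsent(key, Optional.of(converted));
        oldCache.remove(key);
        return raced == null ? converted : raced.orElse(null);
    }

    // write: new cache only, and drop the stale old entry.
    void put(K key, NewV value) {
        newCache.put(key, Optional.of(value));
        oldCache.remove(key);
    }

    // delete: leave a tombstone so the background copy cannot resurrect the entry.
    void remove(K key) {
        newCache.put(key, Optional.empty());
        oldCache.remove(key);
    }

    // Background copy; run once, then purge tombstones when the old cache is drained.
    // The purge is only safe once no caller can still hold a value read from the old cache.
    void migrateRemaining() {
        for (Map.Entry<K, OldV> entry : oldCache.entrySet()) {
            newCache.putIfAbsent(entry.getKey(), Optional.of(converter.apply(entry.getValue())));
            oldCache.remove(entry.getKey());
        }
        for (K key : newCache.keySet()) {
            newCache.remove(key, Optional.empty()); // removes only tombstones
        }
    }
}
```

Keeping the tombstone in the same map as the live values is what lets putIfAbsent detect the delete atomically; a separate tombstone set would reopen the race described above.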

Radim Vansa
  • Thank you for your reply. To add to my question: consider another scenario where the User object is a key in one of the caches, c1, in version v1 of my application. How do I handle the case where, for some reason, I decide to change the key of cache c1 to a new object "Employee" in version v2? That change would be difficult to handle with the externalizer approach, as externalization can handle changes with respect to a class but not with respect to a cache. I guess I will have to give the second approach a try to solve this problem. – Santosh Ananda Aug 21 '14 at 05:34
  • If some instance of Employee should be equal to another instance of User, you have to configure a different data container Equivalence class that treats those types correctly. Probably by first reconfiguring one node, then reconfiguring the second node, and only after that starting to use Employee. – Radim Vansa Aug 21 '14 at 09:22

I faced a similar situation. This is how I solved it:

  1. Write the contents of the current cache to some form of persistent storage before the upgrade.
  2. Create a new cache with different cluster settings, so that it does not replicate with the existing cache.
  3. The upgraded nodes use this new cache.
  4. As soon as the upgraded nodes come up, they read the persisted data and populate the cache according to the new data model.
  5. The old cache eventually dies out once both nodes are upgraded.
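Steps 1 and 4 above could look roughly like the sketch below. It assumes String keys and writes each entry field by field, so the snapshot does not depend on the Java class layout of either version. UserV1, UserV2, SnapshotMigration and the empty-string default for lastName are hypothetical names and choices for this illustration; the byte array stands in for whatever file or database is actually used. Note that updates made to the old cache after the snapshot is taken are not carried over by this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the v1 and v2 User classes from the question.
class UserV1 {
    final String role;
    final String firstName;

    UserV1(String role, String firstName) {
        this.role = role;
        this.firstName = firstName;
    }
}

class UserV2 {
    final String role;
    final String firstName;
    final String lastName;

    UserV2(String role, String firstName, String lastName) {
        this.role = role;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

final class SnapshotMigration {
    // Step 1 (old node): dump the cache field by field. Assumes no concurrent writes.
    static byte[] snapshot(Map<String, UserV1> cache) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(cache.size());
            for (Map.Entry<String, UserV1> entry : cache.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue().role);
                out.writeUTF(entry.getValue().firstName);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Step 4 (upgraded node): rebuild entries in the new data model.
    static Map<String, UserV2> restore(byte[] snapshot) {
        Map<String, UserV2> cache = new HashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(snapshot))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String role = in.readUTF();
                String firstName = in.readUTF();
                // lastName did not exist in v1; the empty default is an assumption.
                cache.put(key, new UserV2(role, firstName, ""));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cache;
    }
}
```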
NiranjanBhat