
I have a Redis Cluster (3 masters and 3 slaves) running inside a Kubernetes cluster. The cluster is exposed via a Kubernetes Service (Kube-Service).

I have my application server connected to the Redis Cluster (using the Kube-Service as the URI) via the Lettuce Java client for Redis. I also have the following client options set on the Lettuce connection object:

ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
              .enablePeriodicRefresh(Duration.ofMinutes(10))
              .enableAllAdaptiveRefreshTriggers()
              .build();

ClusterClientOptions clusterClientOptions = ClusterClientOptions.builder()
              .topologyRefreshOptions(topologyRefreshOptions)
              .autoReconnect(true)
              .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
              .build();
redisClient.setOptions(clusterClientOptions);

Now when I test this setup by killing one of my Redis masters (pods), Kubernetes does its job and reschedules a new pod. But the new pod has a new IP address, and it is never discovered by Lettuce. How does Lettuce handle rediscovery? It seems the topology-refresh logic above does not do a fresh DNS lookup for the new IPs.

Are there any samples out there, or has anyone handled this? I have read multiple GitHub issues on Lettuce itself, but none gives a clear answer as to how it was handled.

Best

Shabirmean
  • Did you try performing some operations on the keys owned by that Redis node? You need to perform an operation to see whether the client can really rediscover the new node. – sonus21 Aug 05 '20 at 11:06
  • @sonus21 - I haven't actually tried it, but I can do that. However, is there a deterministic way to generate one key in each of the different hash slots owned by the cluster nodes? – Shabirmean Aug 05 '20 at 13:27
  • @sonus21 Thank you for pointing it out. Indeed, on the next WRITE/READ attempt to a hash slot on this node, the client succeeds in getting the new IP. – Shabirmean Aug 05 '20 at 21:04
  • 1
    AFAIK, you can enable autodiscovery as well, for the 2nd question, there's no deterministic way to generate node id from the key, generally, you should have the Redis cluster information in hand before you can know the actual node. Also, the client should implement the same algorithm used by Redis, libraries like Lettuce and others have implemented such things internally you can browse throw the code to see how it resolve node from the key. 2nd point is you have enabled periodic refresh with 10 minutes interval so it should get autodiscovered in 10 minutes post-shutdown. – sonus21 Aug 06 '20 at 04:49
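The slot-hashing algorithm mentioned in the comments is specified by Redis Cluster: `HASH_SLOT = CRC16(key) mod 16384`, with an optional `{hash tag}` that restricts hashing to a substring. A minimal, self-contained sketch of that computation (Lettuce ships the same logic as `io.lettuce.core.cluster.SlotHash`; the class and method names below are otherwise illustrative):

```java
import java.nio.charset.StandardCharsets;

public class RedisSlot {
    // CRC16-CCITT (XModem) as used by Redis Cluster: poly 0x1021, init 0x0000
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) & 0xFFFF
                                            : (crc << 1) & 0xFFFF;
            }
        }
        return crc;
    }

    // HASH_SLOT = CRC16(key) mod 16384, honoring non-empty {hash tags}
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {                 // only a non-empty tag counts
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) & 16383;
    }

    public static void main(String[] args) {
        // Reference value from the Redis Cluster spec: CRC16("123456789") = 0x31C3
        System.out.println(slot("123456789"));
        // Keys sharing a hash tag land on the same slot (and hence the same node)
        System.out.println(slot("{user1000}.following") == slot("{user1000}.followers"));
    }
}
```

To answer the comment's question directly: with this function you can generate candidate keys and bucket them by slot until you have one key per slot range of interest, then look up which node owns each range via `CLUSTER SLOTS`.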

1 Answer


Courtesy of the first comment on the question above.

So I was able to resolve this as follows.

  • The setup above, with the given client options, is good. However, I had to set disconnectedBehavior to ACCEPT_COMMANDS. This ensures the client continues to send commands to Redis during the failover.
  • Because operations keep flowing, the first READ or WRITE that reaches the client after the failover has elected a new master causes the cluster to correctly return the new IP address of the new node. From then on, the client knows the new IP for the slots held by the failed node.
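A sketch of the adjusted options, assuming the same `redisClient` object and refresh settings as in the question; the only change relative to the original snippet is the `disconnectedBehavior` value:

```java
ClusterTopologyRefreshOptions topologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
        .enablePeriodicRefresh(Duration.ofMinutes(10))
        .enableAllAdaptiveRefreshTriggers()
        .build();

ClusterClientOptions clusterClientOptions = ClusterClientOptions.builder()
        .topologyRefreshOptions(topologyRefreshOptions)
        .autoReconnect(true)
        // ACCEPT_COMMANDS: keep issuing commands during the failover so the first
        // command after re-election gets redirected and teaches the client the new IP
        .disconnectedBehavior(ClientOptions.DisconnectedBehavior.ACCEPT_COMMANDS)
        .build();

redisClient.setOptions(clusterClientOptions);
```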

This is a lazy approach that reconciles on the next READ or WRITE attempt. But it works, and I believe it's good enough. I am not sure whether there are better ways to handle this.

Shabirmean