
I am looking for ideas on how to achieve EACH_QUORUM consistency in a three-datacenter cluster while tolerating the loss of one datacenter for writes.

Based on my understanding, EACH_QUORUM does not tolerate the loss of a datacenter: all my writes will fail for as long as the datacenter is down. One option is to retry the request with a lower consistency level such as QUORUM.
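To make the arithmetic concrete, here is a minimal sketch of how the three consistency levels behave when one datacenter is down. It is only a model of the quorum rules, not driver code; the datacenter names `dc1` to `dc3` and the replication factor of 3 per datacenter are assumptions chosen for illustration.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of `replicas`."""
    return replicas // 2 + 1


def each_quorum_ok(rf: dict, live: dict) -> bool:
    """EACH_QUORUM: every datacenter must reach its own quorum."""
    return all(live.get(dc, 0) >= quorum(n) for dc, n in rf.items())


def quorum_ok(rf: dict, live: dict) -> bool:
    """QUORUM: a majority of all replicas, regardless of datacenter."""
    return sum(live.get(dc, 0) for dc in rf) >= quorum(sum(rf.values()))


def local_quorum_ok(rf: dict, live: dict, local_dc: str) -> bool:
    """LOCAL_QUORUM: a majority of the replicas in the coordinator's datacenter."""
    return live.get(local_dc, 0) >= quorum(rf[local_dc])


# Assumed topology: 3 replicas per datacenter, with dc3 completely down.
rf = {"dc1": 3, "dc2": 3, "dc3": 3}
live = {"dc1": 3, "dc2": 3, "dc3": 0}

print(each_quorum_ok(rf, live))          # False: dc3 cannot reach 2 of 3
print(quorum_ok(rf, live))               # True: 6 live replicas >= 5 of 9
print(local_quorum_ok(rf, live, "dc1"))  # True: dc1 has 3 of 3
```

Under these assumptions, EACH_QUORUM fails as soon as any single datacenter loses its quorum, while QUORUM and LOCAL_QUORUM (from a live datacenter) still succeed.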

The option I am looking for is a way to exclude the failed DC and achieve EACH_QUORUM on the remaining two live datacenters. On the server side, the only way I can think of is to decommission all the nodes of the failed datacenter, which is cumbersome.

Assuming that my client application knows that a datacenter is down, is there a way for the Cassandra client driver to pass in a list of live datacenters, so that the coordinator node tries to achieve EACH_QUORUM only on those datacenters?

Would appreciate any insight on this.

Bharat
  • Possible duplicate of [Configuring Apache Cassandra for Disaster Recovery](http://stackoverflow.com/questions/13647921/configuring-apache-cassandra-for-disaster-recovery) – Raedwald Sep 12 '16 at 07:05

2 Answers


Your idea of retrying with LOCAL_QUORUM can be done by implementing a RetryPolicy and applying it to the Session. Since the driver knows that it won't be able to achieve EACH_QUORUM, it won't actually try it, and you can handle the error in the retry policy instead of letting it bubble up to your application code. I have tried this approach successfully.
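The fallback logic such a policy encodes can be sketched in plain Python. This is not the driver's RetryPolicy interface: `Unavailable`, `execute_with_fallback`, and `fake_execute` are hypothetical names introduced here for illustration, and a real policy would hook into the driver's own unavailable-replica callback rather than wrap the call.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for the driver's 'not enough live replicas' error."""


def execute_with_fallback(execute, statement,
                          levels=("EACH_QUORUM", "LOCAL_QUORUM")):
    """Try each consistency level in order; return (level_used, result).

    `execute` is any callable taking (statement, consistency_level).
    """
    if not levels:
        raise ValueError("at least one consistency level is required")
    last_error = None
    for level in levels:
        try:
            return level, execute(statement, level)
        except Unavailable as err:
            last_error = err  # remember the failure and downgrade
    raise last_error


# Hypothetical executor that simulates one datacenter being down.
def fake_execute(statement, level):
    if level == "EACH_QUORUM":
        raise Unavailable("one datacenter has no live replicas")
    return "applied"


level, result = execute_with_fallback(
    fake_execute, "INSERT INTO ks.t (id) VALUES (1)")
print(level, result)  # LOCAL_QUORUM applied
```

Returning the level that actually succeeded lets the caller log or count downgraded writes, which is useful because a write accepted at LOCAL_QUORUM carries a weaker guarantee than the one originally requested.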

Another idea I was considering was to create a separate Session for each DC (each configured with that DC as its "local DC") and execute the write asynchronously against each of them. Then, depending on the errors and on what I can gather from the cluster state, I would either log a warning or throw an error.
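A rough sketch of that per-DC fan-out, using only the standard library: each entry in `writers` stands in for a per-datacenter session's execute call. The function name `write_to_each_dc`, the `tolerated_failures` parameter, and the simulated writers are all assumptions for illustration, not part of any driver.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

log = logging.getLogger("per_dc_writes")


def write_to_each_dc(writers, statement, tolerated_failures=1):
    """Run the write against every datacenter in parallel.

    `writers` maps a datacenter name to a callable(statement).
    Returns the sorted names of datacenters whose write failed, and raises
    RuntimeError if more than `tolerated_failures` of them failed.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(writers))) as pool:
        futures = {dc: pool.submit(fn, statement) for dc, fn in writers.items()}
    # Leaving the `with` block waits for every submitted write to finish.
    failed = {}
    for dc, future in futures.items():
        err = future.exception()
        if err is not None:
            failed[dc] = err
    if len(failed) > tolerated_failures:
        raise RuntimeError(f"write failed in too many datacenters: {sorted(failed)}")
    for dc, err in failed.items():
        log.warning("write to %s failed: %s", dc, err)
    return sorted(failed)


# Hypothetical writers simulating two healthy datacenters and one that is down.
def ok(statement):
    return "applied"


def down(statement):
    raise ConnectionError("datacenter unreachable")


failed_dcs = write_to_each_dc({"dc1": ok, "dc2": ok, "dc3": down},
                              "INSERT INTO ks.t (id) VALUES (1)")
print(failed_dcs)  # ['dc3']
```

Note that writing through independent sessions gives up the single-coordinator semantics of EACH_QUORUM: the application, not Cassandra, decides how many datacenter failures are acceptable.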

Andrew Baker

Why use EACH_QUORUM if you want to be able to survive a DC failure? Use LOCAL_QUORUM instead; it was designed for exactly that.

deniszh