3

The following statement from the Cassandra documentation is the reason for my doubt.

For example, if using a write consistency level of QUORUM with a replication factor of 3, Cassandra will replicate the write to all nodes in the cluster and wait for acknowledgement from two nodes. If the write fails on one of the nodes but succeeds on the other, Cassandra reports a failure to replicate the write on that node. However, the replicated write that succeeds on the other node is not automatically rolled back.

Ref : http://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_atomicity_c.html

So does Cassandra write to a node (which is up) even if the consistency level cannot be met?

  • See also https://stackoverflow.com/questions/21747586/how-data-will-be-consistent-on-cassandra-cluster?rq=1 – DNA May 27 '15 at 13:48

3 Answers

4

I got it. Cassandra will not even attempt the write if it knows that consistency cannot be met. If consistency CAN be met but there are not enough live replicas to satisfy the replication factor, Cassandra will write to the currently available replicas and return a success message. Later, when the dead replica comes back up, the missed write will be replayed to it.

For example: if the replication factor is 3 and 1 of the 3 nodes is down, then a write with a consistency level of TWO will succeed. But if the replication factor is 2 and 1 of the 2 nodes is down, then a write with a consistency level of TWO will not even be attempted on the single node that is available.
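The availability check described above can be sketched as a toy model. This is an illustrative Python sketch, not Cassandra's actual code; the names `attempt_write` and `required_acks` are invented for illustration:

```python
# Toy model (NOT the real Cassandra implementation) of the coordinator's
# up-front availability check: if fewer live replicas exist than the
# requested consistency level demands, the write is rejected immediately
# (an UnavailableException) and no replica is written at all.

CONSISTENCY_LEVELS = {"ONE": 1, "TWO": 2}

def required_acks(consistency, replication_factor):
    """Number of replica acknowledgements the coordinator must collect."""
    if consistency == "QUORUM":
        return replication_factor // 2 + 1
    if consistency == "ALL":
        return replication_factor
    return CONSISTENCY_LEVELS[consistency]

def attempt_write(consistency, replication_factor, live_replicas):
    """Return how many replicas receive the write, or None if unavailable."""
    need = required_acks(consistency, replication_factor)
    if live_replicas < need:
        return None  # rejected up front: nothing is written anywhere
    # Otherwise the write goes to every live replica owning the key;
    # dead replicas are covered later by hinted handoff / repair.
    return min(live_replicas, replication_factor)

# RF=3, one node down, CL=TWO -> write succeeds on the 2 live replicas
assert attempt_write("TWO", 3, 2) == 2
# RF=2, one node down, CL=TWO -> rejected, the live node is never written
assert attempt_write("TWO", 2, 1) is None
```

This matches the two cases in the example: the RF=3 write proceeds, while the RF=2 write fails fast without touching the surviving node.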

What is mentioned in the documentation is the case where the write was initiated while consistency could still be met, but in between, one node went down and couldn't complete the write, whereas the write succeeded on the other node. Since consistency can no longer be met, the client gets a failure message. The record that was written to only a single node would be removed later during node repair or compaction.
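That mid-write failure can be sketched with a hypothetical `coordinate_write` helper (invented for illustration, not a real Cassandra API):

```python
# Toy model of the documented case: the write was dispatched to the
# replicas, but one died mid-write, so the coordinator collects fewer
# acknowledgements than the consistency level requires.

def coordinate_write(required_acks, replica_results):
    """replica_results: one boolean per replica (True = ack received).
    Returns (client_sees_success, indices_of_replicas_holding_data)."""
    acks = sum(replica_results)
    holders = [i for i, ok in enumerate(replica_results) if ok]
    # Too few acks -> the client is told the write failed, but the
    # replicas that DID apply it are not automatically rolled back.
    return acks >= required_acks, holders

# CL=TWO, one replica dies mid-write: the client sees a failure,
# yet the data still sits on replica 0 afterwards.
ok, holders = coordinate_write(2, [True, False])
assert ok is False and holders == [0]
```

The key point the sketch illustrates is that "failure" is a statement about acknowledgements, not a guarantee that no replica stored the data.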

3

Consistency in Cassandra is defined at the statement level. That means you specify, on a particular query, what level of consistency you need.

This implies that when the consistency level is not met, the failure applies to that particular statement, not to the cluster as a whole.

There is no rollback in Cassandra. What you have in Cassandra is eventual consistency. That means your statement might succeed in the future, if not immediately. When a replica node comes back alive, the cluster (thanks to Cassandra's fault tolerance) will take care of writing to the replica node.

So, if your statement failed, it might still succeed in the future. This is in contrast to the RDBMS world, where an uncommitted transaction is rolled back as if nothing had happened.

Update: I stand corrected. Thanks Arun.

From:

http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_about_hh_c.html

During a write operation, when hinted handoff is enabled and consistency can be met, the coordinator stores a hint about dead replicas in the local system.hints table under either of these conditions:

So it's still not a rollback. Nodes know the current cluster state and don't initiate the write if consistency cannot be met.
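That hinted-handoff behaviour can be illustrated with a toy model (the `Coordinator` and `Node` classes here are invented for illustration; in Cassandra the hints actually live in the local `system.hints` table):

```python
# Toy model of hinted handoff: when a target replica is down, the
# coordinator stores a hint locally and replays the write once the
# replica comes back up -- eventual consistency, not rollback.

class Node:
    def __init__(self, up=True):
        self.up = up
        self.data = {}

class Coordinator:
    def __init__(self):
        self.hints = []  # stands in for the system.hints table

    def write(self, key, value, replicas):
        for node in replicas:
            if node.up:
                node.data[key] = value
            else:
                self.hints.append((node, key, value))  # store a hint

    def on_node_up(self, node):
        node.up = True
        remaining = []
        for target, key, value in self.hints:
            if target is node:
                node.data[key] = value  # replay the missed write
            else:
                remaining.append((target, key, value))
        self.hints = remaining

a, b = Node(), Node(up=False)
c = Coordinator()
c.write("k", "v", [a, b])
assert a.data == {"k": "v"} and b.data == {}  # b missed the write
c.on_node_up(b)
assert b.data == {"k": "v"}                   # hint replayed on recovery
```

As the quoted documentation notes, Cassandra only stores such hints when the consistency level can be met in the first place, which is why no rollback is ever needed.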

pinkpanther
  • I doubt it will succeed in the future, because the client already got the response that the write failed, so the client is not expecting to see the data again in the future. – Arun Joy Thekkiniyath May 28 '15 at 04:00
  • @ArunJoyThekkiniyath The response you got is client-side, for that particular statement. For example, if you read the same row with a lower consistency level (ex: ONE), it's a successful operation. So, consistency is what you define for your case. That's why it is called **Tunable consistency**. When the dead replica node comes back alive, the data will eventually be written; hence, eventual consistency. Read http://stackoverflow.com/questions/24587869/understand-cassandra-replication-factor-versus-consistency-level and http://stackoverflow.com/questions/18160548/handling-quorum-writies-fail-on-cassandra – pinkpanther May 28 '15 at 15:10
1

At the driver level, you get an exception.

On the nodes where the write succeeded, the data is actually written, and it is going to be rolled back eventually.

In a normal situation, you can consider that the data was not written to any of the nodes.

From the documentation:

If the write fails on one of the nodes but succeeds on the other, Cassandra reports a failure to replicate the write on that node. However, the replicated write that succeeds on the other node is not automatically rolled back.

jorgebg
  • Your answer currently seems to contradict itself - you say data was rollbacked, then you quote the documentation which says "*not* automatically rolled back" ? – DNA May 27 '15 at 12:50
  • I modified the answer to try to explain myself a little better. – jorgebg May 27 '15 at 14:06
  • Wrong...It will not be rolled back...it will be retried on the failed node. – pinkpanther May 27 '15 at 16:43
  • Suppose I have a 2-node cluster (node_A and node_B). I do an insert when node_B is down, with a consistency level of 2. As expected, the insert failed. Since node_A was up, the insert would have gone to node_A, so I should be able to read from node_A with consistency ONE. I tried this, but I didn't get the inserted record. – Arun Joy Thekkiniyath May 28 '15 at 04:12
  • @ArunJoyThekkiniyath What's your replication factor, and how do you know that it wrote to at least one node? You define how many nodes to write to using the replication factor, not the consistency level. – pinkpanther May 28 '15 at 15:20
  • @pinkpanther, My replication factor is 2. CREATE KEYSPACE samosa_ks_2 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'} AND durable_writes = true; – Arun Joy Thekkiniyath May 29 '15 at 04:55
  • @ArunJoyThekkiniyath Can you try with NetworkTopologyStrategy? Do you have two datacenters? – pinkpanther May 30 '15 at 06:37
  • @ArunJoyThekkiniyath Did you get empty records, or did you get an error while reading? You would be better off writing a new question with this exact problem along with the topology, queries and all. – pinkpanther May 30 '15 at 06:40
  • Thanks @pinkpanther, I got it. Cassandra will not even attempt the write if it knows that consistency cannot be met. If consistency CAN be met but there are not enough live replicas to satisfy the replication factor, Cassandra will write to the currently available replicas and return a success message. Later, when the dead replica comes back up, the missed write will be replayed to it. – Arun Joy Thekkiniyath May 30 '15 at 06:59
  • For example: if the replication factor is 3 and 1 of the 3 nodes is down, then a write with a consistency level of TWO will succeed. But if the replication factor is 2 and 1 of the 2 nodes is down, then a write with a consistency level of TWO will not even be attempted on the single node that is available. – Arun Joy Thekkiniyath May 30 '15 at 07:00
  • I believe what is mentioned in the documentation is the case where the write was initiated while consistency could still be met, but in between, one node went down and couldn't complete the write, whereas the write succeeded on the other node. Since consistency can no longer be met, the client gets a failure message. The record that was written to only a single node would be removed later during node repair or compaction. – Arun Joy Thekkiniyath May 30 '15 at 07:06
  • @ArunJoyThekkiniyath It's interesting to know, but I'm still skeptical... can you give any reference, or possibly write a question to confirm this? – pinkpanther May 30 '15 at 07:13
  • @ArunJoyThekkiniyath It seems you are correct indeed http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_about_hh_c.html Please write an answer. Thanks for the question. – pinkpanther May 30 '15 at 07:17
  • I'm still confused by the No rollbacks section of this link which talks about partial writes: http://mighty-titan.blogspot.in/2012/06/understanding-cassandras-consistency.html – pinkpanther May 30 '15 at 07:27
  • @pinkpanther , posted answer. I will post another question to get to know how Cassandra actually removes this record as there is no rollback. – Arun Joy Thekkiniyath May 30 '15 at 07:41
  • @ArunJoyThekkiniyath It will not remove it; it just will not initiate the write if consistency cannot be met. – pinkpanther May 30 '15 at 07:49