
Hi, I was trying out different configurations using the site https://www.ecyrd.com/cassandracalculator/

But I could not understand the following result shown for this configuration:

Cluster size: 3
Replication factor: 2
Write level: 1
Read level: 1

You can survive the loss of no nodes without data loss.

For reference, I have seen the question "Cassandra loss of a node".

But it still does not help me understand why write level 1 with replication factor 2 would mean my Cassandra cluster cannot survive the loss of even a single node without data loss.

A write request goes to all replica nodes, and even if only 1 of them responds, the write is a success. So assuming 1 node is down, every write request will go to the other replica node and return success, and the data will become eventually consistent.
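For example, this is the kind of write I mean (a minimal sketch with the DataStax Python driver; the contact point, keyspace and table names are placeholders I made up):

    # Sketch of a write at consistency level ONE (names are placeholders).
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['127.0.0.1']).connect('demo')

    # With CL ONE the coordinator acknowledges the write as soon as a
    # single replica (possibly the coordinator itself) has committed it.
    insert = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.ONE)
    session.execute(insert, (42, 'alice'))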

Can someone help me understand with an example?

Hiteshdua1
  • 2,126
  • 18
  • 29

3 Answers


I guess what the calculator is working with is the worst case scenario.

You can survive the loss of one node if your data is available redundantly on two out of three nodes. The thing with write level ONE is that there is no guarantee that the data is actually present on two nodes right after your write is acknowledged.

Let's assume the coordinator of your write is one of the nodes holding a copy of the record you are writing. With write level ONE you are telling the cluster to acknowledge your write as soon as it has been committed to one of the two nodes that should hold the data. The coordinator might do that before even attempting to contact the other node (to reduce the latency perceived by the client). If, in that moment right after acknowledging the write but before contacting the second node, the coordinator goes down and cannot be brought back, then you have lost that write and the data with it.
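Here is a toy model of that failure sequence in Python (no real Cassandra involved; the node names and the ack-before-replicate timing are assumptions for the sake of illustration):

    # Toy model of the worst case: the coordinator commits locally,
    # acknowledges the client (CL ONE is satisfied), then dies before
    # replicating to the second replica.
    replica_a = {}   # coordinator, also holds a replica
    replica_b = {}   # the second replica

    def write_cl_one(key, value):
        replica_a[key] = value           # 1. commit on one replica
        print("ack to client")           # 2. CL ONE satisfied -> acknowledge
        raise RuntimeError("node down")  # 3. crash before replicating
        replica_b[key] = value           # (never reached)

    try:
        write_cl_one("user:42", "alice")
    except RuntimeError:
        pass

    print(replica_b.get("user:42"))  # None -> the acknowledged write is lost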

Ralf
  • 6,735
  • 3
  • 16
  • 32
  • Can you just confirm my understanding. Following is the worst case scenario: 1. Cassandra gets a write request and sends it to all replica nodes. 2. The write is acknowledged by a single node (consistency level ONE) and then Cassandra acknowledges the write request to the client. 3. The other replica (which also got the write request) did not acknowledge it because at that time it somehow failed to write the data. 4. The original node which acknowledged the request goes down. Can you help me define scenarios for case 3 if my understanding is correct? – Hiteshdua1 Aug 22 '17 at 12:26
  • @Hiteshdua1, I updated my answer to be more specific and take the coordinator into account. – Ralf Aug 22 '17 at 14:13
  • This is correct - the calculator is working with the worst case scenario and Ralf's explanation is correct. – Janne Jalkanen Nov 14 '17 at 10:06

When you read or write data, Cassandra computes the hash token for it and routes the request to the nodes responsible for that token. When you have a 3-node cluster with a replication factor of 2, your data is stored on 2 of the 3 nodes. So if the 2 nodes responsible for a token A are both down, and that token is not owned by node 3, then even though you still have one node left you will get a TokenRangeOfflineException.

The point is that we need the replicas (token owners) to be alive, not just any nodes. Also see the similar question answered here.
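A toy ring in Python may make this clearer (the token values and node names are made up; real Cassandra uses Murmur3 hashing and virtual nodes):

    # Toy token ring: 3 nodes, replication factor 2. A partition is
    # placed on the node owning its token plus the next node clockwise.
    from bisect import bisect_left

    ring = [(100, "node1"), (200, "node2"), (300, "node3")]  # sorted by token
    RF = 2

    def live_replicas(token, up_nodes):
        idx = bisect_left([t for t, _ in ring], token) % len(ring)
        owners = [ring[(idx + i) % len(ring)][1] for i in range(RF)]
        return [n for n in owners if n in up_nodes]

    # Token 200 lives on node2 and node3. If both are down, the data is
    # unreachable even though node1 is still up.
    print(live_replicas(200, {"node1", "node2", "node3"}))  # ['node2', 'node3']
    print(live_replicas(200, {"node1"}))                    # [] -> unavailable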

Shoban Sundar
  • 563
  • 1
  • 8
  • 11
  • So my question is, is the statement "You can survive the loss of no nodes without data loss" correct? I believe I can survive the loss of 1 data node in the above-mentioned scenario. – Hiteshdua1 Aug 22 '17 at 11:27
  • It depends on the hashed token: if your hashed token is available on the surviving node then the operation will succeed, or else it will fail. – Shoban Sundar Aug 22 '17 at 11:29
  • Won't the replication factor make at least 1 node have that hashed token available? Am I missing something here? I believe a single write request goes to a coordinator node, which always sends it to all replica nodes, and even when 1 node is down I still get a successful write, so I believe I can survive the loss of 1 node without data loss. – Hiteshdua1 Aug 22 '17 at 11:33
  • Each node in a Cassandra cluster owns a token range, which you can see in nodetool ring/status. E.g. you have nodes 1, 2 and 3 with tokens A, B and C respectively. When the coordinator node (say node 1) receives the data, it is hashed and the computed token is, say, B. So the coordinator node will try to write to node 2 (which owns token B), followed by node 3 (cyclic order). If nodes 2 and 3 (owning tokens B and C) are down, then obviously the node responsible for the token is not available and hence your operation fails. You must understand token computation and the assignment of tokens to nodes. – Shoban Sundar Aug 22 '17 at 12:00

This is the case because the write level is 1. If your application gets an acknowledgement from 1 node only (and relies on the data becoming eventually consistent, which takes a non-zero amount of time), then data can get lost if that one server is itself lost before the sync can happen.
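If you cannot afford that window, the usual fix is to require more replicas to acknowledge before the write succeeds. A sketch with the DataStax Python driver (the contact point, keyspace and table names are placeholders):

    # With RF 2, writing at CL ALL (equal to QUORUM here) means the ack
    # implies both replicas hold the data, closing the loss window at
    # the cost of availability: the write fails if either replica is down.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['127.0.0.1']).connect('demo')

    insert = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.ALL)
    session.execute(insert, (42, 'alice'))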

r005t3r
  • 228
  • 1
  • 8
  • But Cassandra always sends the write request to all the replica nodes, irrespective of the consistency level. Following is the worst case scenario: 1. Cassandra gets a write request and sends it to all replica nodes. 2. The write is acknowledged by a single node (consistency level ONE) and then Cassandra acknowledges the write request to the client. 3. The other replica (which also got the write request) did not acknowledge it because at that time it somehow failed to write the data. (It would be useful if you could expand on any scenarios for this?) 4. The original node that acknowledged the write goes down. – Hiteshdua1 Aug 22 '17 at 12:14
  • I think you know the worst case scenario. Now the question is how point 3 can happen. One reason I can think of is an overload scenario (when the node is busy handling other requests and fails to serve a few); a write request sent to it will time out in this case. A _hint_ might be saved on the coordinator node, but the worst case is that the replica also stays down beyond max_hint_window_in_ms. See the sketch below for how this surfaces to the client. – r005t3r Aug 22 '17 at 14:07
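How that overload case looks from the client side (a hedged sketch with the DataStax Python driver; statement and names are the same placeholders as above):

    # When a replica misses the write deadline, the coordinator raises a
    # WriteTimeout even though the write may still land later (or be
    # replayed from a hint within max_hint_window_in_ms).
    from cassandra import ConsistencyLevel, WriteTimeout
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(['127.0.0.1']).connect('demo')
    insert = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.ONE)

    try:
        session.execute(insert, (42, 'alice'))
    except WriteTimeout:
        # The replica did not respond in time; success is uncertain.
        print("write timed out")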