
Are there any known issues with initial_token collision when adding nodes to a cluster in a VM environment?

I'm working on a 4-node cluster set up in a VM environment. We're running into issues when we attempt to add nodes to the cluster.

In the cassandra.yaml file, initial_token is left blank. Since we're running Cassandra > 1.0, auto_bootstrap should default to true.

It's my understanding that each of the nodes in the cluster should be assigned an initial token at startup.

This is not what we're currently seeing. We don't want to manually set the value of initial_token for each node (that kind of defeats the goal of being dynamic). We have also set the partitioner to random: partitioner: org.apache.cassandra.dht.RandomPartitioner
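
For reference, here's the relevant slice of our cassandra.yaml (auto_bootstrap is written out explicitly below for clarity; we actually just rely on the 1.x default of true):

    # cassandra.yaml excerpt (same on every node)
    initial_token:                                            # intentionally left blank
    partitioner: org.apache.cassandra.dht.RandomPartitioner
    auto_bootstrap: true                                      # the 1.x default when omitted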

I've outlined the steps we follow and the results we're seeing below. Can someone please advise as to what we're missing here?

Here are the detailed steps we are taking:

1) Kill all cassandra instances and delete data & commit log files on each node.

2) Startup Seed Node (S.S.S.S)

Starts up fine.

3) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463

4) X.X.X.X Startup

 INFO [GossipStage:1] 2012-11-29 21:16:02,194 Gossiper.java (line 850) Node /X.X.X.X is now part of the cluster
 INFO [GossipStage:1] 2012-11-29 21:16:02,194 Gossiper.java (line 816) InetAddress /X.X.X.X is now UP
 INFO [GossipStage:1] 2012-11-29 21:16:02,195 StorageService.java (line 1138) Nodes /X.X.X.X and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /X.X.X.X is the new owner
 WARN [GossipStage:1] 2012-11-29 21:16:02,195 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /Y.Y.Y.Y to /X.X.X.X

5) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
W.W.W.W         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

6) Y.Y.Y.Y Startup

 INFO [GossipStage:1] 2012-11-29 21:17:36,458 Gossiper.java (line 850) Node /Y.Y.Y.Y is now part of the cluster
 INFO [GossipStage:1] 2012-11-29 21:17:36,459 Gossiper.java (line 816) InetAddress /Y.Y.Y.Y is now UP
 INFO [GossipStage:1] 2012-11-29 21:17:36,459 StorageService.java (line 1138) Nodes /Y.Y.Y.Y and /X.X.X.X have the same token 113436792799830839333714191906879955254.  /Y.Y.Y.Y is the new owner
 WARN [GossipStage:1] 2012-11-29 21:17:36,459 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /X.X.X.X to /Y.Y.Y.Y

7) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Y.Y.Y.Y         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

8) Z.Z.Z.Z Startup

 INFO [GossipStage:1] 2012-11-30 04:52:28,590 Gossiper.java (line 850) Node /Z.Z.Z.Z is now part of the cluster
 INFO [GossipStage:1] 2012-11-30 04:52:28,591 Gossiper.java (line 816) InetAddress /Z.Z.Z.Z is now UP
 INFO [GossipStage:1] 2012-11-30 04:52:28,591 StorageService.java (line 1138) Nodes /Z.Z.Z.Z and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /Z.Z.Z.Z is the new owner
 WARN [GossipStage:1] 2012-11-30 04:52:28,592 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /Y.Y.Y.Y to /Z.Z.Z.Z

9) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
W.W.W.W         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Z.Z.Z.Z         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

Thanks in advance.

JohnB

2 Answers


This is what I did to fix this problem:

  1. Stop the Cassandra service.
  2. Set auto_bootstrap: false on the seed node.
  3. Empty the data and commitlog directories:
     sudo rm -rf /var/lib/cassandra/data/*
     sudo rm -rf /var/lib/cassandra/commitlog/*
  4. Restart the service (the full sequence is sketched below).
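
Put together as a shell session (a sketch; the service name and the /var/lib/cassandra paths assume a default package install):

    sudo service cassandra stop
    # edit cassandra.yaml on the seed node: auto_bootstrap: false
    sudo rm -rf /var/lib/cassandra/data/*
    sudo rm -rf /var/lib/cassandra/commitlog/*
    sudo service cassandra start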

I tested this with Cassandra 3.7.

Rushi Agrawal

Clearly your nodes are holding onto some stale cluster information that is being reused at startup. Make sure to delete the LocationInfo directories, which contain each node's saved data about the cluster. You also have a very strange token layout (where's the 0 token, for example?), so you're certainly going to need to reassign tokens if you want proper ownership.
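
For example, on each node (a sketch assuming the default 1.x data directory; the system keyspace's LocationInfo files live under it):

    # stop the node first, then clear its saved cluster state
    sudo rm -rf /var/lib/cassandra/data/system/*LocationInfo*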

It may help to explain how token assignment works. In a brand-new cluster, the first node gets assigned token 0 by default and has 100% ownership. If you do not specify a token for the next node, Cassandra calculates one such that the original node owns the lower 50% of the ring and the new node the higher 50%.
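
To make that concrete: RandomPartitioner's token space runs from 0 to 2^127, so with the first node at token 0, the calculated token for the second node is the midpoint (a quick check, assuming python is available):

    python -c 'print(2**127 // 2)'
    # 85070591730234615865843651857942052864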

When you add node 3, its token will be inserted between the first two, so you'll actually end up with ownership that looks like 25%, 25%, 50%. This is really important, because the lesson to learn here is that Cassandra will NEVER reassign a token by itself to balance the ring. If you want your ownership balanced properly, you must assign your own tokens. This is not hard to do, and there's actually a utility provided to do it.
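
The arithmetic for balanced tokens is just token_i = i * 2^127 / N. A minimal sketch for your 4-node cluster (this mirrors what the provided token-generation utility computes; put each value into initial_token before a node's first startup):

    # evenly spaced RandomPartitioner tokens for N=4
    for i in 0 1 2 3; do
      python -c "print($i * 2**127 // 4)"
    done
    # 0
    # 42535295865117307932921825928971026432
    # 85070591730234615865843651857942052864
    # 127605887595351923798765477786913079296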

So Cassandra's initial bootstrap process, while dynamic, may not yield the ring balance you want. You can't simply let new nodes join willy-nilly without some intervention to make sure you get the desired result; otherwise you'll end up with exactly the scenario you've laid out in your question.

rs_atl