
I am just trying to learn Cassandra. I am doing the simple exercise of setting up a two-node cluster, but I am having difficulties - it has never worked so far. Cassandra version: 2.1.1.

Host OS: Centos 6.5 64-bit

Java: 8 (Oracle)

Number of Nodes: 2

Node addresses: 192.168.0.41 and 192.168.0.43 (Static)

Ports open on firewalls on both boxes: 7000, 9042, 9160, 7199

I did the following to setup cluster:

Changed cluster_name on both boxes to "MyCluster", both in cassandra.yaml and in the system tables, as described here:

cassandra - Saved cluster name Test Cluster != configured name

Changed listen_address to 192.168.0.41 and 192.168.0.43 respectively.

Changed rpc_address to 192.168.0.41 and 192.168.0.43 respectively.

On 41 I set "seeds: 192.168.0.43"

On 43 I set "seeds: 192.168.0.43" (same as on 41)
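
For reference, one way to double-check that both nodes actually picked up these settings is to grep them out of cassandra.yaml on each box. The path below is an assumption (a package install); adjust it to wherever your cassandra.yaml lives:

# run on both 192.168.0.41 and 192.168.0.43 (path assumed; adjust to your install)
grep -E '^(cluster_name|listen_address|rpc_address):|- seeds:' /etc/cassandra/conf/cassandra.yaml
# expected on 41:  cluster_name: 'MyCluster'
#                  listen_address: 192.168.0.41
#                  rpc_address: 192.168.0.41
#                  - seeds: "192.168.0.43"
# expected on 43:  the same, but with listen_address/rpc_address set to 192.168.0.43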

Each node works by itself (when the other is down): it starts and responds to

nodetool status

just fine and keeps running. I can also connect with cqlsh and run

describe keyspaces;

That works too. But when I run both nodes simultaneously, one of them dies after a minute or two.

The exact symptoms are: the node still responds to the cqlsh command describe keyspaces fine, so it is kind of alive, but when I try to run nodetool status, the following error is printed in the nodetool output:

error: No nodes present in the cluster. Has this node finished starting up?
-- StackTrace --
java.lang.RuntimeException: No nodes present in the cluster. Has this node finished starting up?
    at org.apache.cassandra.dht.Murmur3Partitioner.describeOwnership(Murmur3Partitioner.java:130)
    ....

The other node continues to run fine and it keeps reporting 100% ownership by itself as the only node in cluster.
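
One way to compare both nodes' views of the ring from a single machine is to query each over JMX (port 7199 is already open on both boxes). This is just standard nodetool usage and assumes remote JMX connections are permitted on these nodes:

# run from either box; the node that has "died" typically returns the
# "No nodes present in the cluster" error, while the healthy one still
# shows a single-node ring owning 100% of the tokens
nodetool -h 192.168.0.41 -p 7199 status
nodetool -h 192.168.0.43 -p 7199 status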

Here is the system.log portion from 43 around the time it "died":

WARN  [GossipStage:1] 2014-11-17 04:33:30,163 TokenMetadata.java:198 - Token -7592767110844961279 changing ownership from /192.168.0.43 to /192.168.0.41
WARN  [GossipStage:1] 2014-11-17 04:33:30,163 TokenMetadata.java:198 - Token -7240492143116021720 changing ownership from /192.168.0.43 to /192.168.0.41
WARN  [GossipStage:1] 2014-11-17 04:33:30,163 TokenMetadata.java:198 - Token -8434936427655644773 changing ownership from /192.168.0.43 to /192.168.0.41
WARN  [GossipStage:1] 2014-11-17 04:33:30,163 TokenMetadata.java:198 - Token -1656745619022636889 changing ownership from /192.168.0.43 to /192.168.0.41
WARN  [GossipStage:1] 2014-11-17 04:33:30,163 TokenMetadata.java:198 - Token -7470625165291146007 changing ownership from /192.168.0.43 to /192.168.0.41
INFO  [HANDSHAKE-/192.168.0.41] 2014-11-17 04:33:30,230 OutboundTcpConnection.java:427 - Handshaking version with /192.168.0.41
INFO  [GossipTasks:1] 2014-11-17 04:33:49,179 Gossiper.java:906 - InetAddress /192.168.0.41 is now DOWN
INFO  [HANDSHAKE-/192.168.0.41] 2014-11-17 04:33:50,190 OutboundTcpConnection.java:427 - Handshaking version with /192.168.0.41
INFO  [SharedPool-Worker-1] 2014-11-17 04:34:30,224 Gossiper.java:892 - InetAddress /192.168.0.41 is now UP
INFO  [CompactionExecutor:5] 2014-11-17 04:41:01,178 CompactionManager.java:521 - No files to compact for user defined compaction
INFO  [CompactionExecutor:6] 2014-11-17 04:51:01,187 CompactionManager.java:521 - No files to compact for user defined compaction

Any idea what could be wrong? Thank you.

henry

5 Answers


Refer: How do you fix node token collision issues when starting up Cassandra nodes in a cluster on VMWare?

"Make sure to delete the Location Info directories, which contain the data about the cluster"

I deleted the following folders, and then it worked fine (see the command sketch after the list):

  1. /home/db/cassandra/apache-cassandra-2.1.2/data/data
  2. /home/db/cassandra/apache-cassandra-2.1.2/data/commitlog
  3. /home/db/cassandra/apache-cassandra-2.1.2/data/saved_caches
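
Not part of the original answer, but roughly the commands this amounts to, assuming the tarball install paths listed above. Stop Cassandra on the node first, and note this wipes all data on the node, so only do it on a throwaway dev cluster:

# stop Cassandra on the node, then clear the old cluster state (paths assumed from above)
rm -rf /home/db/cassandra/apache-cassandra-2.1.2/data/data/*
rm -rf /home/db/cassandra/apache-cassandra-2.1.2/data/commitlog/*
rm -rf /home/db/cassandra/apache-cassandra-2.1.2/data/saved_caches/*
# then start the seed node first, followed by the second node
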
Jobin

I am not sure a recursive seed (a node listing itself as its own seed) is a good thing. Try removing the seed entry on 43 ("I set seeds: 192.168.0.43").

G Quintana

It seems your config is correct. Let's try the following:

Start 43 first (the seed node)

After 43 finishes starting up, start 41.
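
A sketch of what that startup order looks like for a tarball install; the exact start command and log location depend on how Cassandra was installed:

# on 192.168.0.43 (the seed), from the Cassandra install directory
bin/cassandra
tail -f logs/system.log        # wait until startup completes before touching the other node

# only then, on 192.168.0.41
bin/cassandra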

Rocherlee
  • Started 43. Made sure it is running and responding. Waited 2 minutes. Checked 43 again - all good. Started 41. Made sure it is running and responding. After about 15 seconds I checked 43 again; it was dead with the symptoms described above. 41 kept running as the only node in the cluster – henry Nov 17 '14 at 09:41
  • @henry could you also check (and post relevant portions of) the end of /var/log/cassandra/system.log? That should give more of a clue as to why node 43 is having trouble. – BrianC Nov 17 '14 at 16:30
  • @henry I don't see any obvious cause in that log snippet. Two thoughts: 1) very low memory, causing Cassandra to close out without meaningful errors - how much memory do these 2 nodes have? 2) erase all Cassandra data and start again with 43, then 41 (assuming this is a dev system and it is safe to delete) – BrianC Nov 18 '14 at 06:28
  • @henry Maybe when the second node is starting, check whether the first node is streaming data to it using `nodetool netstats` – G Quintana Nov 19 '14 at 20:45
  • @GQuintana I ran "nodetool netstats". It came back with "Mode: Normal. Not Sending Any streams." Plus some more stats. What does this tell us? Thanks – henry Dec 04 '14 at 06:02
  • @Brian each node has 2G of memory, but Cassandra does not close out - it continues to run; the problem is that the cluster does not form – henry Dec 04 '14 at 06:20

I'm also new to Cassandra, and I ran into exactly the same error as you described above.

My environment:

Host OS: Centos 6.5 64-bit

Cassandra: 2.1.2, raw binary package (not rpm installed)

Java 7 (Oracle)

Firewall closed

two nodes in the same LAN

I also tried many times, but the problem just could not be solved. Finally, I decided to delete the current Cassandra binaries on both nodes and start over from a freshly extracted package.

Surprisingly, this worked.

I then redid all my configuration and started Cassandra, and no problem happened this time. Both nodes started and the cluster formed successfully.

I know this can hardly be called a "solution" to the problem, but I just want to share my experience here. I wonder whether some cached information caused it?
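
For what it's worth, a rough sketch of the "start from a fresh extraction" approach described above; file names and paths are illustrative, not from the original answer:

# stop Cassandra on both nodes first (however you normally stop it)
mv apache-cassandra-2.1.2 apache-cassandra-2.1.2.old     # keep the old install aside
tar xzf apache-cassandra-2.1.2-bin.tar.gz                # unpack a clean copy
# re-apply cluster_name, listen_address, rpc_address and seeds in conf/cassandra.yaml,
# then start the seed node first and the second node after it has come up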

Flickerlight

This happens because metadata about the cluster still exists. Clear the local and peer files under the default directory (metadata_directory: /var/lib/cassandra/metadata), or under whatever metadata directory path is set in cassandra.yaml, and then start the Cassandra service.
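
A hedged sketch of what this answer describes; the directory below is the one named in the answer, and whether it exists depends on your version and install layout, so check cassandra.yaml first and only do this on a node whose data you can afford to lose:

sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/metadata/*    # path taken from this answer; adjust to your cassandra.yaml
sudo service cassandra start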