I have a two-node cluster on AWS. Everything was working fine until yesterday. Today I came across a problem: when I run nodetool status, each node reports the other as down (node1 thinks node2 is down, and vice versa).

From ip2

ip2$ nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
DN  <ip1>  ?          256          ?       27c91f95-4b58-492b-a16e-d9b99867a505  r1
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
UN <ip2>  9.11 GiB   256          ?       e628324d-34dd-4c9c-a53d-99abfacb54af  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

From ip1

ip1$ nodetool status
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
DN  <ip2>  ?          256          ?       e628324d-34dd-4c9c-a53d-99abfacb54af  r1
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
UN  <ip1>  9.14 GiB   256          ?       27c91f95-4b58-492b-a16e-d9b99867a505  rack1

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

Per the note on the last line there is some problem with the replication settings, but I am not able to figure it out. Please suggest.
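
To compare what each node and each keyspace actually declare, here is a hedged sketch of checks to run on both nodes (the properties-file path assumes a package install, and the system_schema table assumes Cassandra 3.x):

ip1$ nodetool info | grep -E 'Data Center|Rack'        # DC/rack the snitch assigns to this node
ip1$ cqlsh <ip1> -e "SELECT keyspace_name, replication FROM system_schema.keyspaces;"
ip1$ cat /etc/cassandra/cassandra-rackdc.properties    # read by GossipingPropertyFileSnitch, if that snitch is configured

If a keyspace's replication map names a datacenter that no live node reports (note the DC1 vs datacenter1 mismatch in the output above), ownership cannot be computed consistently, which may be what triggers that note.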

Relevant lines from the Cassandra system log:

WARN  [OptionalTasks:1] 2017-08-08 15:33:37,223 CassandraRoleManager.java:344 - CassandraRoleManager skipped default role setup: some nodes were not ready
INFO  [OptionalTasks:1] 2017-08-08 15:33:37,223 CassandraRoleManager.java:383 - Setup task failed with error, rescheduling
INFO  [HANDSHAKE-/172.15.14.106] 2017-08-08 15:33:37,340 OutboundTcpConnection.java:515 - Handshaking version with /172.15.14.106
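
The repeated handshake line suggests the nodes keep retrying their inter-node connections. A minimal connectivity check, assuming the default storage_port of 7000 (adjust if cassandra.yaml overrides it):

ip1$ nc -vz <ip2> 7000        # the gossip/storage port must be reachable in both directions
ip1$ nodetool gossipinfo      # shows the DC, RACK, and STATUS each node is gossiping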
  • It seems the nodes have not joined your cluster correctly. E.g., nodetool status from ip2 shows ip1 in DC1 and down, whereas nodetool status from ip1 shows ip2 in DC1 and down. So the issue might be in your configuration. Please provide the output of nodetool describecluster, plus your snitch and topology configuration. – Shoban Sundar Aug 08 '17 at 06:18
  • Cluster Information: Name: TestCassandra Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch Partitioner: org.apache.cassandra.dht.Murmur3Partitioner Schema versions: 7fad977f-ddb3-3d46-91b7-be8799e669da: [ip1] UNREACHABLE: [] – Avinash Aug 08 '17 at 06:34
  • Can you ping ip2 from the ip1 host? When you execute nodetool describecluster, it should show the same schema version for all of your nodes (a sketch of these checks follows this comment thread). – Shoban Sundar Aug 08 '17 at 06:36
  • Yes. As I said, everything was working fine until yesterday. – Avinash Aug 08 '17 at 06:37
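
A hedged sketch of the checks suggested in the comments above (placeholder addresses as in the question; nodetool must run on a cluster node):

ip1$ ping -c 3 <ip2>             # basic reachability from ip1 to ip2
ip1$ nodetool describecluster    # every live node should report the same schema version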

0 Answers