Cassandra Read Timeouts on Specific Servers

Asked Oct 25 '17 at 11:11

We have a five-node Cassandra cluster with replication factor 3. We are experiencing a lot of read timeouts in our application. When we checked tpstats on each Cassandra node, we saw that three of the nodes have a lot of dropped read requests and high CPU utilisation, whereas on the other two nodes dropped read requests are zero and CPU utilisation is moderate. Note that the total number of read requests is almost the same on all servers.
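
For anyone wanting to reproduce the per-node comparison, here is a minimal sketch of how the dropped-read counts could be pulled out of `nodetool tpstats` output. It assumes the output contains a "Message type / Dropped" section with one message type per row; the helper name `parse_dropped` and the sample numbers are hypothetical and are not taken from our cluster.

```python
def parse_dropped(tpstats_text):
    """Return {message_type: dropped_count} parsed from `nodetool tpstats` text.

    Assumption: dropped messages are listed after a header row that starts
    with "Message type", with the count in the second column.
    """
    dropped = {}
    in_dropped_section = False
    for line in tpstats_text.splitlines():
        parts = line.split()
        if parts[:2] == ["Message", "type"]:
            in_dropped_section = True  # rows below this header are drop counters
            continue
        if in_dropped_section and len(parts) >= 2 and parts[1].isdigit():
            dropped[parts[0]] = int(parts[1])
    return dropped


# Illustrative sample only; the figures are made up.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                        32       117        9876543         0                 0

Message type           Dropped
READ                      1523
RANGE_SLICE                  0
MUTATION                     0
"""

if __name__ == "__main__":
    print(parse_dropped(SAMPLE).get("READ", 0))
```

Running this against the output captured from each node (for example via `nodetool -h <host> tpstats`) gives one READ drop count per node, which makes the three-versus-two split easy to see side by side.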

After taking a thread dump, we found that the high CPU utilisation is caused by Parallel GC running far more often on the three nodes than on the other two. What we cannot understand is why GC should run more on three nodes and less on two, when the distribution of our partition keys and our queries is almost uniform.

Cassandra version is 2.2.3.

Pranav Pal

Do you have lots of tombstones on those nodes? – Horia Oct 25 '17 at 11:26
We have many tombstones, but the number is the same on all nodes. – Pranav Pal Oct 25 '17 at 12:56
Did you also check the hints? – Horia Oct 25 '17 at 13:17
You could also look at what kind of queries are being executed on those nodes. Some queries that take a long time to execute will return a timeout to the client, but they will still be running on the cluster. – Horia Oct 25 '17 at 13:31

0 Answers