
Can anyone explain why read timeouts would occur in my Cassandra cluster every 4-5 minutes? I see this consistently and can reproduce it reliably.

Read throughput was held at 1 request per second, with no writes or compaction running in the cluster. The read timeout is configured as 5 seconds in cassandra.yaml. I am using the DataStax Java driver 2.1.14.

Tracing does not work during a timeout. Repeating the read manually in cqlsh for the same row key succeeds. I looked at https://issues.apache.org/jira/browse/CASSANDRA-8194, which looks similar to the issue I am seeing, but we use DSE 4.6.2, which already includes that fix. I also tried disabling auth in my cluster, but I still see the same read timeouts every 4-5 minutes.

Please suggest ways to troubleshoot this.

Thanks, Srivatsan

  • Your question is light on detail, but I would suspect your schema design here. Another question: do you have a socket timeout on the client side that is lower than the one on the server side? 2.1.4 did have a bug related to this, but your timeout seems very low for a distributed system. – Ananth May 24 '15 at 07:12
  • The schema is pretty straightforward: just a select query on the row key. The row is not big, probably 1 or 2 KB. The socket read timeout is set to 60 seconds. I am using the 2.0.10 driver. I tried the driver's latest speculative execution feature, but that does not help either. An explicit retry from the client succeeds without a timeout most of the time (around 80%). I suspect an external factor is causing the timeouts. – Srivatsan Nallazhagappan Jun 17 '15 at 12:26
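
The explicit client-side retry mentioned in the last comment could look roughly like the sketch below. It is only an illustration, not the poster's code: `ReadRetry`, `readWithRetry`, and the backoff value are hypothetical names, and the standard library's `TimeoutException` stands in for whatever read-timeout exception the driver throws, so the example runs without the driver on the classpath. Each timeout is logged with a timestamp, which makes it easier to check whether failures really line up on a 4-5 minute cadence.

```java
import java.time.Instant;
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

public class ReadRetry {

    /**
     * Runs the given read and retries it when it times out.
     * Assumption: the read signals a timeout by throwing TimeoutException;
     * with a real driver you would catch its read-timeout exception instead.
     */
    public static <T> T readWithRetry(Callable<T> read, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TimeoutException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return read.call();
            } catch (TimeoutException e) {
                last = e;
                // Timestamped log line: useful for correlating timeouts with
                // GC pauses, cron jobs, or other periodic external events.
                System.err.printf("%s read timed out (attempt %d of %d)%n",
                        Instant.now(), attempt, maxAttempts);
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                }
            }
        }
        throw last; // every attempt timed out
    }
}
```

A call would wrap the actual query, for example `ReadRetry.readWithRetry(() -> runSelect(rowKey), 3, 100L)`, where `runSelect` is a placeholder for the real read. A retry like this hides the symptom but does not explain the periodic stall, so the logged timestamps are the more useful part for troubleshooting.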

1 Answer


I had the same problem; it was resolved by setting read_request_timeout_in_ms to 10000 (10 seconds).
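
For reference, the change would be a one-line edit in cassandra.yaml along the lines of the fragment below. This is a sketch that assumes the usual workflow of editing the file on each node and restarting it; the 5-second timeout described in the question corresponds to a value of 5000.

```yaml
# cassandra.yaml, on each node
# How long the coordinator waits for a read to complete, in milliseconds.
read_request_timeout_in_ms: 10000
```

Raising the value gives slow reads more time to finish. It does not by itself explain why reads stall every few minutes.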

Abhinandan Satpute