There are 2 DCs, each with 3 nodes; the RF used for writes is 2, and reads use EACH_QUORUM. A lightweight transaction (LWT) is used to ensure consistency of updates across DCs. For certain records, hundreds (maybe thousands) of LWT updates hit the cluster at around the same time. All of these updates fail with "Operation timed out - received only 0 responses". Not even one attempt manages to change the status of that record, and the contention makes every other attempt fail as well. Ideally, the first attempt would go through and change the values, so that subsequent LWT updates would be rejected because their conditions are no longer satisfied. Is there any way to achieve this?

I tried increasing the cas_contention timeout, but this did not help; it only made all the transactions wait longer before failing. Using "local consistency" made the LWTs run faster, but that does not help in our case since we want strong consistency across both DCs. Are there any alternatives?

nmakb
  • It looks like you have high contention at the partition level, and all your queries are issued against the same partition. Please show the table and the queries that are issued. – Alex Ott Jan 26 '19 at 12:04
  • ```CREATE TABLE tablename ( col1 text, col2 timestamp, col3 text, col4 text, col5 text, PRIMARY KEY (col1, col2) ) WITH CLUSTERING ORDER BY (col2 ASC)...``` – nmakb Feb 16 '19 at 02:40
  • ```UPDATE tablename USING TTL ? SET col3 = ? WHERE col1 = ? AND col2 = ? IF col4 IN ? and col3 in (null,'',?) ``` – nmakb Feb 16 '19 at 02:41
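
For reference, the "first attempt wins" behaviour the question is hoping for can be sketched in plain Python. This is only an in-memory, sequential model of the `IF` condition in the `UPDATE` statement above; it does not talk to Cassandra and does not reproduce the Paxos contention that causes the timeouts. The names `Row` and `conditional_update`, the `"OPEN"` status value, and the `worker-N` values are hypothetical, and it assumes the last bind marker in `col3 in (null,'',?)` is the value being written.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Row:
    """Hypothetical in-memory stand-in for one row of the table."""
    col3: Optional[str]
    col4: str


def conditional_update(row: Row, new_col3: str, allowed_col4: Set[str]) -> bool:
    """Model of: SET col3 = ? ... IF col4 IN ? AND col3 IN (null, '', ?).

    Returns True when the condition held and the write was applied,
    mirroring the [applied] column of an LWT result.
    """
    # Assumption: the third allowed col3 value is the value being written.
    condition = row.col4 in allowed_col4 and row.col3 in (None, "", new_col3)
    if condition:
        row.col3 = new_col3
    return condition


# Five writers race for the same row; here they are simply run in order.
row = Row(col3=None, col4="OPEN")
results = [conditional_update(row, f"worker-{i}", {"OPEN"}) for i in range(5)]

print(results)   # only the first attempt is applied
print(row.col3)  # value written by the first attempt
```

In this model only the first call is applied, and every later call is rejected because `col3` no longer matches the condition. That is the outcome the question wants from the cluster instead of every attempt timing out.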

0 Answers