
We have run into several data-disappearance issues lately. Our data is log data with a composite key, (id, requestdate).

Our program constantly inserts new records into C*. There are no delete operations. The data was successfully written and we were able to select it back out, but after a while the data for some ids is gone.

One reason we can think of is that we use the Kundera Cassandra driver, which has its default write consistency_level set to ONE. There are no errors in the system log.

Do you think this problem is caused by the write consistency_level? Thanks.

Edit: We have not run nodetool repair for a while. Could this cause the data-disappearance issue?

Dev
Yan Liu
2 Answers


There is a chance that if you read directly after the write, and one of the other replicas acts as the coordinator, it will not have received the data yet. If you need that kind of consistency in your reads, do both reads and writes with CL.QUORUM. It is safe to assume this window has passed within ~500 ms or so. See Eventual Consistency != Hopeful Consistency
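The reason QUORUM on both sides closes that window is the standard overlap rule: a read is guaranteed to see the latest write whenever the number of read replicas plus write replicas exceeds the replication factor (R + W > N). A minimal sketch of that arithmetic (the helper function is illustrative, not part of any driver):

```python
def read_sees_latest_write(n, w, r):
    """With replication factor n, write consistency w, and read
    consistency r, the read and write replica sets must overlap
    (and the read therefore sees the write) exactly when r + w > n."""
    return r + w > n

# RF=3: QUORUM writes (2 replicas) + QUORUM reads (2 replicas) overlap.
assert read_sees_latest_write(3, 2, 2)
# RF=3: ONE writes + ONE reads may hit disjoint replicas.
assert not read_sees_latest_write(3, 1, 1)
```

This is why CL.ONE writes followed by CL.ONE reads can transiently miss data, while QUORUM/QUORUM cannot, regardless of timing.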

Chris Lohfink
  • Our problem is, write was successful and a read after write displayed the data. After one day or some time, data under some ID (partition key) are gone, can't select from any node. – Yan Liu Jan 18 '15 at 17:43
  • My guess then is that there is something either deleting it or setting TTL on the data. Another possibility is that your application maybe attempting to use the wrong partition key. – Chris Lohfink Jan 18 '15 at 20:30
  • There is no TTL setting on the data. How is possible setting the wrong partition key could cause this? Thanks. – Yan Liu Jan 19 '15 at 02:27
  • If you generate the wrong partition key, you won't get any data back. – Chris Lohfink Jan 19 '15 at 04:27
  • It turns out the Kundera Cassandra driver defaults to rolling back the transaction if a write fails (a write timeout, in our case). The rollback is a delete command, which caused the data to be deleted. Your comments reminded me of this. Thanks. – Yan Liu Jan 29 '15 at 05:35

A consistency level of ONE indicates that success is returned as soon as the write completes on one replica node. Data should not vanish from Cassandra unless the write itself was never successful.

If the insertion was not successful because of down nodes, check out hinted handoff, and consider increasing the hinted handoff window.
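The hint window is controlled in cassandra.yaml. A sketch of the relevant settings, using the defaults shipped with the 2.x line (verify the values against your own version's cassandra.yaml):

```yaml
# cassandra.yaml (excerpt)
hinted_handoff_enabled: true
# How long hints keep accumulating for a down node before
# the coordinator gives up; 10800000 ms = 3 hours by default.
# A node that stays down longer than this needs repair to
# regain the missed writes.
max_hint_window_in_ms: 10800000
```

Note that hints are a best-effort mechanism; if a node is down past the hint window, only anti-entropy repair (nodetool repair) will bring its replicas back in sync.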

What is your replication factor? Maybe increase it to a greater number to prevent loss of availability when a node goes down.
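Changing the replication factor is a CQL statement against the keyspace (the keyspace name and strategy below are placeholders for whatever your schema actually uses), followed by a repair so existing data is streamed to the new replicas:

```sql
-- cqlsh: raise the replication factor on a hypothetical "logs" keyspace.
ALTER KEYSPACE logs
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Then, from a shell on each node, stream existing data to the new replicas:
--   nodetool repair logs
```

Without the repair step, only writes made after the ALTER will reach the additional replicas.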

Desert Ice
  • The write was successful; after writing the data, it could be selected and displayed. Replication is 3. Another problem we found is that nodetool repair has not run for a while. As for the data loss, all the data under one ID (partition key) is lost, not only the latest data. Thanks. – Yan Liu Jan 18 '15 at 17:36
  • Any chance you use a TTL on your keyspace? As far as I know, nodetool repair has to be run manually at least once every gc_grace_seconds. – Desert Ice Jan 19 '15 at 07:24
  • Check which node that one ID is assigned to. Another possibility could be timestamps: data for one particular key would not be overwritten by values with an older timestamp. Also, you may try running nodetool flush or refresh { unlikely scenarios }. – vivek mishra Jan 19 '15 at 07:55