
I have a Cassandra batch statement that contains a delete and an insert for the same partition key, where the delete is the first statement and the insert is the second. How does the batch execute these statements? Does it run them in the same order in which we added them?

Aaron
Jobs

1 Answer


No, it does not execute them in the order specified. To force a particular ordering, add a USING TIMESTAMP clause to each statement; the write with the later timestamp wins. Check the docs for more information: http://docs.datastax.com/en/cql/3.1/cql/cql_reference/batch_r.html

How can a timestamp maintain the order of execution? For example, in the case above (a delete and an insert for the same partition key), the final result should be the inserted record. Is that possible by adding a timestamp?

Yes. I'll combine examples from the link above and the DELETE documentation to demonstrate, and start by creating a simple table called purchases with two fields:

CREATE TABLE purchases (user text PRIMARY KEY, balance bigint);

Next, I'll execute a batch with an INSERT and a DELETE. I'll do the DELETE last, but with an earlier timestamp than the INSERT:

BEGIN BATCH
  INSERT INTO purchases (user, balance) VALUES ('user1', -8) USING TIMESTAMP 1432043350384;
  DELETE FROM purchases USING TIMESTAMP 1432043345243 WHERE user='user1';
APPLY BATCH;

When I query for that user:

aploetz@cqlsh:stackoverflow2> SELECT user, balance, writetime(balance) FROM purchases WHERE user='user1';

 user  | balance | writetime(balance)
-------+---------+--------------------
 user1 |      -8 |      1432043350384

(1 rows)

As you can see, the INSERT persisted because it had the later timestamp. Had I simply run the INSERT and DELETE (in that order) from the cqlsh prompt, the query would have returned nothing.

Aaron
  • In your `BATCH` example, if you omitted the timestamps, would Cassandra guarantee that the row for `user1` is deleted? Which one wins if both statements have the same timestamp (in this case, one calculated by the server)? – Sotirios Delimanolis Nov 17 '16 at 20:20
  • [Here we go.](https://datastax-oss.atlassian.net/browse/JAVA-237) Tombstone will always take precedence over regular columns. Also described [here](https://issues.apache.org/jira/browse/CASSANDRA-6426). – Sotirios Delimanolis Nov 17 '16 at 20:53
  • @SotiriosDelimanolis I like the "beyond silly" comment in the first link :) – Zheng Liu Oct 07 '21 at 11:45
  • These are the rules for resolving conflicts in a batch statement: 1. If the timestamps are different, pick the column with the largest timestamp (whether the value is a regular column or a tombstone). 2. If the timestamps are the same and one of the columns is a tombstone ('null'), pick the tombstone. 3. If the timestamps are the same and neither column is a tombstone, pick the column with the largest value. Ref: https://issues.apache.org/jira/browse/CASSANDRA-6426?focusedCommentId=13836059&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13836059 – tainguyentt Jan 05 '23 at 06:34
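
The three rules in the last comment can be sketched as a small Python model. This is only an illustration of the rules as quoted, not Cassandra's actual code: `Cell` and `reconcile` are hypothetical names, a value of `None` stands in for a tombstone, and comparing raw bytes in rule 3 is an assumption about what "largest value" means.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """Hypothetical model of one written column value."""
    timestamp: int          # write timestamp
    value: Optional[bytes]  # None models a tombstone (a DELETE)


def reconcile(a: Cell, b: Cell) -> Cell:
    """Return the cell that wins under the three rules quoted above."""
    # Rule 1: different timestamps, the larger timestamp wins.
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Rule 2: same timestamp and one cell is a tombstone, the tombstone wins.
    if a.value is None:
        return a
    if b.value is None:
        return b
    # Rule 3: same timestamp, no tombstones, the larger value wins
    # (byte-wise comparison is an assumption of this sketch).
    return a if a.value >= b.value else b


# The batch from the answer: the INSERT carries the later timestamp.
insert = Cell(1432043350384, b"-8")
delete = Cell(1432043345243, None)
print(reconcile(insert, delete) is insert)  # True

# Same timestamp on both (as when none is specified): the delete wins.
tie = reconcile(Cell(100, b"-8"), Cell(100, None))
print(tie.value is None)  # True
```

In this model the argument order never changes the winner, which matches the point of the answer: statement order inside the batch does not matter, only the timestamps (and, on a tie, the tombstone) do.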