
I am using Python 3.4 and CQLEngine. In my code, I save an object in an overridden save() method as follows:

from cqlengine import columns
from cqlengine.models import Model

class Foo(Model, ...):
    id = columns.Integer(primary_key=True)
    bar = columns.Text()
    ...

    def save(self):
        super(Foo, self).save()

I would like to know, from the return value of save(), whether it performed an insert or an update.

Ryan Kuhl

1 Answer


INSERT and UPDATE are synonyms in Cassandra, with very few exceptions. Here is a description of INSERT that briefly touches on one difference:

An INSERT writes one or more columns to a record in a Cassandra table atomically and in isolation. No results are returned. You do not have to define all columns, except those that make up the key. Missing columns occupy no space on disk.

If the column exists, it is updated. You can qualify table names by keyspace. INSERT does not support counters, but UPDATE does. Internally, the insert and update operation are identical.

You cannot know in advance whether a write will be an insert or an update. Think of it as a request to save data; the coordinator then determines what it is.

This answers your original question: you can't tell from the return value of the save function whether it was an insert or an update.

In response to your comment below, which explained why you wanted that output: you can't reliably get this information out of Cassandra, but you can use lightweight transactions to a certain extent and run two statements sequentially against the same rows of data:

INSERT ... IF NOT EXISTS followed by UPDATE ... IF EXISTS

The target table needs a column into which each of these statements writes a value unique to that call. You can then select the rows by the primary keys of your dataset and count how many rows carry each value, which roughly tells you how many inserts and how many updates there were. However, if any concurrent processes were writing, they may have overwritten your rows with their own tokens, so this method is not very accurate and works (like any other method with databases such as Cassandra) only when there are no concurrent writers.
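As a rough illustration, here is a minimal Python sketch of a variation on the two-statement idea. Instead of counting marker values afterwards, it reads the `[applied]` flag that a lightweight transaction returns. Several things here are assumptions, not part of the question: `session` is taken to be a connected cassandra-driver `Session` whose result objects expose `was_applied`, the table `foo (id int PRIMARY KEY, bar text)` is hypothetical, and the helper names `save_and_classify` and `upload` are made up for this example. It bypasses CQLEngine's `save()` entirely, and the counts are only trustworthy when no other process writes the same rows.

```python
from collections import Counter


def save_and_classify(session, row_id, bar):
    """Write one row and report "insert", "update", or "error".

    Assumes `session` is a connected cassandra-driver Session and that a
    hypothetical table foo (id int PRIMARY KEY, bar text) exists.
    """
    # INSERT ... IF NOT EXISTS is applied only when no row has this key.
    inserted = session.execute(
        "INSERT INTO foo (id, bar) VALUES (%s, %s) IF NOT EXISTS",
        (row_id, bar),
    )
    if inserted.was_applied:
        return "insert"

    # The row already existed, so overwrite it with a conditional update.
    updated = session.execute(
        "UPDATE foo SET bar = %s WHERE id = %s IF EXISTS",
        (bar, row_id),
    )
    # If the row vanished between the two statements, neither one applied.
    return "update" if updated.was_applied else "error"


def upload(session, rows):
    """Save (id, bar) pairs and keep a running count per outcome."""
    counts = Counter()
    for row_id, bar in rows:
        try:
            counts[save_and_classify(session, row_id, bar)] += 1
        except Exception:
            # Any driver or server failure is counted as an error.
            counts["error"] += 1
    return counts
```

Calling `upload(session, [(1, "a"), (2, "b")])` returns a `Counter` keyed by outcome. Each lightweight transaction costs extra round trips between replicas, so this is noticeably slower than a plain write.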

Roman Tumaykin
  • Unfortunately, my application needs to keep a running count of inserts/updates/errors so that data uploads can be confirmed, so this approach doesn't help much – Ryan Kuhl Mar 03 '15 at 20:29
  • This is not an approach, just an explanation of the limitations of Cassandra that do not allow this type of usage. – Roman Tumaykin Mar 03 '15 at 20:49