
I am retrieving data from Cassandra and mapping it to a class using the built-in object mapping API in the Java driver. After I process the data I want to delete it. My clustering key is a timestamp and it is mapped to a Date object. When I try to delete a partition it does not get deleted. I suspect it is because of the mapping to the Date object and that some precision is lost there. Have you encountered a similar problem?

The Accessor:

@Query("SELECT * FROM my_table WHERE id = ? AND event_time < ?")
    Result<MyObject> getAllObjectsByTime(UUID id, Date eventToTime);

The retrieval of the objects:

MappingManager manager = new MappingManager(_cassandraDatabaseManager.getSession());
CassandraAccessor cassandraAccessor = manager.createAccessor(CassandraAccessor.class);
Result<MyObject> myObjectResult = cassandraAccessor.getAllObjectsByTime(id, eventToTime);

MyObject:

@Table(keyspace = "myKeyspace", name = "my_table")
public class MyObject
{
    @PartitionKey
    @Column(name = "id")
    private UUID id;

    @Column(name = "event_time")
    private Date eventTime;
}

The delete logic:

PreparedStatement statement = session
        .prepare("DELETE FROM my_table WHERE id = ? AND event_time = ?;");

BatchStatement batch = new BatchStatement();

for (MyObject myObject : myObjects)
{
    batch.add(statement.bind(myObject.getId(), myObject.getEventTime()));
}

session.execute(batch);

EDIT

After a lot of debugging I figured that maybe the Date is not the problem. It appears that the delete is working, but not for all of the partitions. When I debug the Java application I get the following CQL statement:

DELETE FROM my_table WHERE id=86a2f31d-5e6e-448b-b16c-052fe92a87c9 AND event_time=1442491082128;

When it is executed through the Cassandra Java Driver the partition is not deleted. If I execute it in the cqlsh console the partition is deleted. I have no idea what is happening. I am starting to suspect that there is a problem with the Cassandra Java Driver. Any ideas?


EDIT 2

This is the table:

CREATE TABLE my_table(
    id uuid,
    event_time timestamp,
    event_data text,
    PRIMARY KEY (id, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
Ivan Stoyanov

2 Answers


I'd need to see more of your code to understand how you are issuing the delete, but perhaps you aren't specifying the timestamp to the correct precision on the delete.

Internally, timestamp fields are stored as epoch time in milliseconds. When you look at a timestamp in cqlsh, it is displayed rounded down to the nearest second, like this:

SELECT * from t12 where a=1 and b>'2015-09-16 12:51:49+0000';

 a | b
---+--------------------------
 1 | 2015-09-16 12:51:49+0000

So if you try to delete using that date string, it won't be an exact match, since the real value is something like 2015-09-16 12:51:49.123+0000.

If you show the timestamp as an epoch time in milliseconds, then you can delete it with that:

SELECT a, blobAsBigint(timestampAsBlob(b)) from t12;
 a | system.blobasbigint(system.timestampasblob(b))
---+------------------------------------------------
 1 |                                  1442407909964

DELETE from t12 where a=1 and b=1442407909964;

See this.
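
You can do the same check from the Java side by printing the exact epoch milliseconds the driver will bind. This is just an illustrative sketch, assuming the MyObject mapping from the question:

Date eventTime = myObject.getEventTime();
// java.util.Date keeps millisecond precision, so this prints the exact
// value to compare against blobAsBigint(timestampAsBlob(event_time)) in cqlsh.
System.out.println("event_time millis = " + eventTime.getTime()); // e.g. 1442407909964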

Jim Meyer
  • Plus one from me...*exactly* what I think is happening here, too. – Aaron Sep 16 '15 at 13:27
  • I think that this is the issue. Do you know how I can implement this using the java driver and the object mapping API? I presume that I can set the eventTime property to long and use it like that. But is there a better way (to keep the date as it is)? – Ivan Stoyanov Sep 16 '15 at 13:48
  • When you do the delete, you might need to do myObject.getEventTime().getTime() in your bind statement so that it uses the epoch in milliseconds rather than some default conversion. – Jim Meyer Sep 16 '15 at 14:09
  • When I try that I get the following exception `com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 1 of CQL type timestamp, expecting class java.util.Date but class java.lang.Long provided` – Ivan Stoyanov Sep 16 '15 at 14:17
  • Then I'm not sure how to fix it. I don't usually try to delete individual timestamps and generally use TTL to handle deletes. – Jim Meyer Sep 16 '15 at 14:21
  • After further investigation I think that the milliseconds are not the problem. If I use `QueryBuilder` and create a `Delete.Where` query with `myObject.getEventTime()` the query looks like this: `DELETE FROM my_table WHERE id=86a2f31d-5e6e-448b-b16c-052fe92a87c9 AND event_time=1442412817850;`. I have found out some of the rows are deleted successfully and some of them are not deleted. I'll continue looking into this problem. – Ivan Stoyanov Sep 16 '15 at 15:08
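
The InvalidTypeException quoted above occurs because the driver maps a CQL timestamp column to java.util.Date, not to long. A minimal sketch of a bind that keeps the full millisecond precision, assuming the prepared statement from the question:

long epochMillis = myObject.getEventTime().getTime();
// The driver expects java.util.Date for a timestamp column, so wrap the
// epoch milliseconds back in a Date rather than binding the raw long.
batch.add(statement.bind(myObject.getId(), new Date(epochMillis)));

Since Date already carries the milliseconds, this is equivalent to binding getEventTime() directly; the precision is not lost in the Java object itself.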

I have seen problems with batched statements being dropped or timing out. How many deletes are you trying to execute per batch? Try either lowering your batch size or removing batching altogether.

Remember, batch statements in Cassandra were designed to apply an update atomically to several different tables. They really weren't intended to be used to slam a few thousand updates into one table.
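
For example, here is a sketch of issuing the deletes individually with asynchronous execution instead of one batch, reusing the session and prepared statement from the question (the surrounding names are assumptions, not code from the original post):

import java.util.ArrayList;
import java.util.List;
import com.datastax.driver.core.ResultSetFuture;

List<ResultSetFuture> futures = new ArrayList<>();
for (MyObject myObject : myObjects)
{
    // One delete per statement; no batch involved.
    futures.add(session.executeAsync(
            statement.bind(myObject.getId(), myObject.getEventTime())));
}
// Block until every delete has completed.
for (ResultSetFuture future : futures)
{
    future.getUninterruptibly();
}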

For a good description of how batch statements work, watch the video of (DataStax MVP) Chris Batey's webinar on avoiding Cassandra anti-patterns. At the 16:00 mark he discusses what (exactly) happens in your cluster when it applies a batch statement.

Aaron
  • For testing purposes I have only 5-10 delete statements in my batch. I also tried using the `QueryBuilder` but the result was the same `Delete.Where delete = QueryBuilder.delete() .from("my_table") .where(QueryBuilder.eq("id", myObject.getId())) .and(QueryBuilder.eq("event_time", myObject.getEventTime()));` – Ivan Stoyanov Sep 17 '15 at 14:36