
I have two tables:

CREATE TABLE IF NOT EXISTS QueueBucket (
    queueName   text,
    bucketId    int,
    scheduledMinute timestamp,
    scheduledTime timestamp,
    messageId   uuid,
    PRIMARY KEY ((queueName, bucketId, scheduledMinute), scheduledTime, messageId)
) WITH compaction = { 'class' : 'LeveledCompactionStrategy' } AND speculative_retry = 'NONE';

CREATE TABLE IF NOT EXISTS InDelivery (
    queueName       text,
    nodeId        uuid,
    dequeuedMinute    timestamp,
    messageId       uuid,
    bucketId        int,
    dequeuedTime    timestamp,
    PRIMARY KEY ((queueName, nodeId, bucketId, dequeuedMinute), dequeuedTime, messageId)
);

In the code I insert into QueueBucket and delete from InDelivery in a logged batch. But during load testing the delete from InDelivery somehow doesn't always work, although the insert into QueueBucket works. To confirm this I added a read-back check immediately afterwards: it reads the deleted messageId from InDelivery, and if the messageId still exists it prints a WARN log.

    queueDao.insertMsgInfo(queueName, bucketId, QueueUtils.getMinute(scheduledTime), scheduledTime, messageId);
    queueDao.deleteInDelivery(queueName, nodeId, bucketId, bucketMinute, dequeuedTime, messageId);
    if (queueServiceMetaDao.hasIndeliveryMessage(inDeliveryPK)) {
        log.warn("messageId {} of queue {} bucket {} with node {} dequeuedTime {} dequeued minute {} could not get deleted from indelivery.",
                messageId, queueName, bucketId, nodeId, QueueUtils.dateToString(dequeuedTime), QueueUtils.dateToString(bucketMinute));
    }

In the insertMsgInfo and deleteInDelivery methods I am reusing prepared statements:

"INSERT INTO queuebucket (queuename, bucketid , scheduledminute, scheduledtime, messageid ) VALUES ( ? , ? , ? , ? , ? );"
"DELETE FROM indelivery WHERE queuename = ? AND nodeId = ? AND bucketId=? AND dequeuedMinute=? AND dequeuedTime =? AND messageId=? ;"

In hasIndeliveryMessage I pass the same values, wrapped into inDeliveryPrimaryKey, that I passed when deleting the InDelivery data in the moveBackToQueueBucket method.

"SELECT messageId FROM indelivery WHERE queuename = ? AND nodeId = ? AND bucketId=? AND dequeuedMinute=? AND dequeuedTime=? AND messageId=? ;"

I have no clue why I see multiple WARN messages saying "could not get deleted from indelivery." Please help.

I am using Cassandra version 2.2.7. It is a 6-node Cassandra cluster with replication factor 5, and the read and write consistency used is QUORUM.
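
(With RF 5, QUORUM means floor(5/2) + 1 = 3 replicas, so R + W = 3 + 3 = 6 > 5 and a QUORUM read should always overlap a preceding QUORUM write.)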

I have also gone through Cassandra - deleted data still there and https://issues.apache.org/jira/browse/CASSANDRA-7810, but that issue was fixed long ago, in 2.0.11.

Further update: as per Cassandra - Delete not working I also ran nodetool repair, but the problem still persists. Should I run compact as well?

Further update: I am no longer using a batch; I do a simple insert into QueueBucket and delete from InDelivery and then read the data, but the problem still persists.

Adding some logs:

2016-07-19 20:39:42,440[http-nio-8014-exec-12]INFO  QueueDaoImpl -deleting from indelivery queueName pac01_deferred nodeid 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f bucketId 382 dequeuedMinute 20160719203900000 dequeuedTime 20160719203942310 messageId cc4fb158-f61e-345b-8dcf-3f842fe52d50:
2016-07-19 20:39:42,442[http-nio-8014-exec-12]INFO  QueueDaoImpl -Reading from indelivery : queue pac01_deferred nodeId 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f dequeueMinute 20160719203900000 dequeueTime 20160719203942310 messageid cc4fb158-f61e-345b-8dcf-3f842fe52d50 bucketId 382 indeliveryRow Row[cc4fb158-f61e-345b-8dcf-3f842fe52d50]
2016-07-19 20:39:42,442[http-nio-8014-exec-12]WARN  QueueImpl -messageId cc4fb158-f61e-345b-8dcf-3f842fe52d50 of queue pac01_deferred bucket 382 with node 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f dequeuedTime 20160719203942310 dequeued minute 20160719203900000 could not get deleted from indelivery.

Should I try with consistency ALL?

Laxmikant
  • Check if timestamps are in sync between all the nodes. – undefined_variable Jul 19 '16 at 09:48
  • All Cassandra nodes are in the same timezone. – Laxmikant Jul 19 '16 at 10:06
  • Do you specify timestamps for insert and delete operations? It is a good practice, to avoid a wrong order of mutation operations in Cassandra. Please check for details https://datastax.github.io/java-driver/2.1.7/features/query_timestamps/ – Mikhail Baksheev Jul 19 '16 at 14:36
  • Thanks so much for the reply. No, I am not specifying any timestamps while inserting into and deleting from the InDelivery table. As I am new to Cassandra I want to understand its significance and will apply it as well; let's see if it solves my problem. If possible please provide some good link to understand the timestamp concept. – Laxmikant Jul 19 '16 at 15:00
  • @Laxmikant, http://www.planetcassandra.org/blog/an-introduction-to-using-custom-timestamps-in-cql3/ here is a good explanation of timestamps – Mikhail Baksheev Jul 19 '16 at 16:30
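
For reference, a minimal sketch of the client-side timestamp setup the linked driver docs describe, assuming the DataStax Java driver 2.1+ (the contact point and keyspace name are placeholders):

    import com.datastax.driver.core.AtomicMonotonicTimestampGenerator;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class TimestampedClusterExample {
        public static void main(String[] args) {
            // Every statement sent through this session gets a client-side,
            // monotonically increasing write timestamp, so a delete issued
            // after an insert cannot lose to it because of clock skew
            // between coordinator nodes.
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")
                    .withTimestampGenerator(new AtomicMonotonicTimestampGenerator())
                    .build();
            Session session = cluster.connect("queue_ks");
            // ... run the insert/delete statements as usual ...
            session.close();
            cluster.close();
        }
    }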

2 Answers


First of all, using Cassandra to support queues or queue-like structures is a known anti-pattern. If your queue deals with high throughput, you're going to be fighting with tombstones and degrading query performance.

As for your actual problem, I have seen this happen before with models that use timestamps as keys. How are you creating the timestamp values for dequeuedMinute and dequeuedTime?

If you are putting the timestamps together yourself, then deleting them should be easy. However, if you are creating them with dateOf(now()) or java.util.Date, then your timestamps will have milliseconds stored with them, although cqlsh will mask this from you:

INSERT INTO InDelivery (queuename, nodeid, bucketid, dequeuedMinute, dequeuedTime, messageid)
VALUES ('test1', uuid(), 2112, dateof(now()), dateof(now()), uuid());

INSERT INTO InDelivery (queuename, nodeid, bucketid, dequeuedMinute, dequeuedTime, messageid)
VALUES ('test1', a24e056a-94fa-4aee-b3a7-a8df6060091a, 2112, '2016-07-19 09:57:16-0500', '2016-07-19 09:57:16-0500', uuid());

SELECT queuename, nodeid, bucketid, dequeuedMinute, blobasbigint(timestampasblob(dequeuedMinute)),
       dequeuedTime, blobasbigint(timestampasblob(dequeuedTime)), messageid
FROM InDelivery;

 queuename | nodeid                               | bucketid | dequeuedMinute           | blobasbigint(timestampasblob(dequeuedMinute)) | dequeuedTime             | blobasbigint(timestampasblob(dequeuedTime)) | messageid
-----------+--------------------------------------+----------+--------------------------+-----------------------------------------------+--------------------------+---------------------------------------------+--------------------------------------
     test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a |     2112 | 2016-07-19 09:57:16-0500 |                                 1468940236000 | 2016-07-19 09:57:16-0500 |                               1468940236000 | 7ca1f676-9034-45ba-bb3f-377ba74cc5c0
     test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a |     2112 | 2016-07-19 09:57:16-0500 |                                 1468940236641 | 2016-07-19 09:57:16-0500 |                               1468940236641 | 9721d96e-d6f5-43a7-9ba4-18ef4d54ab8a

(2 rows)

Those timestamps look the same, right? But applying the nested blobasbigint(timestampasblob()) functions reveals the difference (000 vs. 641 milliseconds).

Notice that if I alter my SELECT to filter on the 641 milliseconds (the last 3 digits in the blobasbigint(timestampasblob()) columns), I get only the row that has milliseconds.

SELECT queuename, nodeid, bucketid, dequeuedMinute, blobasbigint(timestampasblob(dequeuedMinute)),
       dequeuedTime, blobasbigint(timestampasblob(dequeuedTime)), messageid
FROM InDelivery
WHERE queuename='test1' AND bucketid=2112
AND nodeid=a24e056a-94fa-4aee-b3a7-a8df6060091a
AND dequeuedMinute='2016-07-19 09:57:16.641-0500';

 queuename | nodeid                               | bucketid | dequeuedMinute           | blobasbigint(timestampasblob(dequeuedMinute)) | dequeuedTime             | blobasbigint(timestampasblob(dequeuedTime)) | messageid
-----------+--------------------------------------+----------+--------------------------+-----------------------------------------------+--------------------------+---------------------------------------------+--------------------------------------
     test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a |     2112 | 2016-07-19 09:57:16-0500 |                                 1468940236641 | 2016-07-19 09:57:16-0500 |                               1468940236641 | 9721d96e-d6f5-43a7-9ba4-18ef4d54ab8a

(1 rows)

The bottom line is that if you're going to store milliseconds in your timestamp keys, then you also need to include them when you SELECT/DELETE by those keys. Likewise, if you are not storing milliseconds in your timestamp keys, then you must not include them when you SELECT/DELETE by those keys.
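
One way to keep the two sides consistent is to build each key value once and reuse the exact same object for the INSERT, the DELETE, and any verification SELECT. A minimal sketch, with a hypothetical toMinute helper standing in for the question's QueueUtils.getMinute:

    import java.util.Calendar;
    import java.util.Date;
    import java.util.TimeZone;

    public class TimestampKeys {

        // Hypothetical stand-in for QueueUtils.getMinute: zero out the
        // seconds and milliseconds so the minute bucket can be rebuilt
        // deterministically later.
        static Date toMinute(Date ts) {
            Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
            cal.setTime(ts);
            cal.set(Calendar.SECOND, 0);
            cal.set(Calendar.MILLISECOND, 0);
            return cal.getTime();
        }

        public static void main(String[] args) {
            Date dequeuedTime = new Date();               // keeps milliseconds
            Date dequeuedMinute = toMinute(dequeuedTime); // reproducible key
            // Bind these same objects (or their getTime() longs) to the
            // INSERT, the DELETE, and the verification SELECT; never rebuild
            // them from a formatted string that drops the milliseconds.
            System.out.println(dequeuedTime.getTime() + " -> " + dequeuedMinute.getTime());
        }
    }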

Aaron
  • Thanks for the reply, you are totally right. I use java.util.Date only, and when inserting/deleting/selecting I don't ignore the millisecond part. I face this issue only during load testing, when thousands of requests come in per second. So I have the same suspicion as @Mikhail Baksheev and am working on setting a client-side timestamp on insert and delete to maintain the order of mutations. I pray that solves my problem :).. – Laxmikant Jul 19 '16 at 18:32

USING TIMESTAMP at the client end solved my problem. Thanks to Mikhail Baksheev for pointing it out.

It is recommended to set the timestamp from the client side in the query, to maintain the order of mutations.

If we insert and then delete the same data, we must make sure the TIMESTAMP value passed in the delete query is greater than the one passed in the insert.
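
A minimal sketch of that, assuming the DataStax Java driver (session, table and parameter names follow the question, but the queries here are illustrative, not the actual DAO statements):

    import java.util.Date;
    import java.util.UUID;
    import com.datastax.driver.core.Session;

    public class ExplicitTimestamps {

        // Insert a row and later delete it, both with explicit client-side
        // timestamps, so the delete's timestamp is guaranteed to be greater
        // than the insert's, whatever the coordinators' clocks say.
        static void insertThenDelete(Session session, String queueName, UUID nodeId,
                                     int bucketId, Date dequeuedMinute,
                                     Date dequeuedTime, UUID messageId) {
            long insertTs = System.currentTimeMillis() * 1000L; // microseconds

            session.execute(
                "INSERT INTO indelivery (queuename, nodeid, bucketid, dequeuedminute,"
              + " dequeuedtime, messageid) VALUES (?, ?, ?, ?, ?, ?)"
              + " USING TIMESTAMP " + insertTs,
                queueName, nodeId, bucketId, dequeuedMinute, dequeuedTime, messageId);

            // ... the message gets processed ...

            session.execute(
                "DELETE FROM indelivery USING TIMESTAMP " + (insertTs + 1)
              + " WHERE queuename = ? AND nodeid = ? AND bucketid = ?"
              + " AND dequeuedminute = ? AND dequeuedtime = ? AND messageid = ?",
                queueName, nodeId, bucketId, dequeuedMinute, dequeuedTime, messageId);
        }
    }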

Other reasons a delete might fail (or appear to fail) in Cassandra could be:

  1. Ignoring the millisecond part of a timestamp key in the DELETE (see the first answer).
  2. Deleted data may reappear if a node is down longer than gc_grace_seconds (the grace period).
Laxmikant