Cassandra delayed / denied updates

Question

I'm having trouble with a small Cassandra cluster that used to work well. I used to have 3 nodes. When I added the 4th, I started seeing issues with values not updating, so I ran nodetool repair (a few times now) on the entire cluster. I should mention that I added the node at the same time as I upgraded from python-cql to the new Python Cassandra driver.

Essentially, the weirdness falls into two cases. The first is denied updates:

cqlsh:analytics> select * from metrics where id = '36122cc69a7a12e266ab40f5b7756daee75bd0d2735a707b369302acb879eedc';

(0 rows)


Tracing session: 19e5bbd0-d172-11e3-a039-67dcdc0d02de

 activity                                                                                                                 | timestamp    | source         | source_elapsed
--------------------------------------------------------------------------------------------------------------------------+--------------+----------------+----------------
                                                                                                   execute_cql3_query | 20:49:34,221 | 10.128.214.245 |              0
 Parsing select * from metrics where id = '36122cc69a7a12e266ab40f5b7756daee75bd0d2735a707b369302acb879eedc' LIMIT 10000; | 20:49:34,221 | 10.128.214.245 |            176
                                                                                                  Preparing statement | 20:49:34,222 | 10.128.214.245 |            311
                                                                                   Sending message to /10.128.180.108 | 20:49:34,222 | 10.128.214.245 |            773
                                                                                Message received from /10.128.214.245 | 20:49:34,224 | 10.128.180.108 |             67
                                                                                                        Row cache hit | 20:49:34,225 | 10.128.180.108 |            984
                                                                                   Read 0 live and 0 tombstoned cells | 20:49:34,225 | 10.128.180.108 |           1079
                                                                                Message received from /10.128.180.108 | 20:49:34,227 | 10.128.214.245 |           5760
                                                                                Enqueuing response to /10.128.214.245 | 20:49:34,227 | 10.128.180.108 |           3045
                                                                             Processing response from /10.128.180.108 | 20:49:34,227 | 10.128.214.245 |           5942
                                                                                   Sending message to /10.128.214.245 | 20:49:34,227 | 10.128.180.108 |           3302
                                                                                                     Request complete | 20:49:34,227 | 10.128.214.245 |           6282


cqlsh:analytics> update metrics set n = n + 1 where id = '36122cc69a7a12e266ab40f5b7756daee75bd0d2735a707b369302acb879eedc';

Tracing session: 20845ff0-d172-11e3-a039-67dcdc0d02de

 activity                                                                                                            | timestamp    | source         | source_elapsed
---------------------------------------------------------------------------------------------------------------------+--------------+----------------+----------------
                                                                                              execute_cql3_query | 20:49:45,328 | 10.128.214.245 |              0
 Parsing update metrics set n = n + 1 where id = '36122cc69a7a12e266ab40f5b7756daee75bd0d2735a707b369302acb879eedc'; | 20:49:45,328 | 10.128.214.245 |            129
                                                                                             Preparing statement | 20:49:45,328 | 10.128.214.245 |            227
                                                                               Determining replicas for mutation | 20:49:45,328 | 10.128.214.245 |            298
                                                                      Enqueuing counter update to /10.128.194.70 | 20:49:45,328 | 10.128.214.245 |            425
                                                                               Sending message to /10.128.194.70 | 20:49:45,329 | 10.128.214.245 |            598
                                                                           Message received from /10.128.214.245 | 20:49:45,330 |  10.128.194.70 |             37
                                                                                  Acquiring switchLock read lock | 20:49:45,331 |  10.128.194.70 |            623
                                                                            Message received from /10.128.194.70 | 20:49:45,331 | 10.128.214.245 |           3335
                                                                                          Appending to commitlog | 20:49:45,331 |  10.128.194.70 |            645
                                                                         Processing response from /10.128.194.70 | 20:49:45,331 | 10.128.214.245 |           3431
                                                                                      Adding to metrics memtable | 20:49:45,331 |  10.128.194.70 |            692
                                                                              Sending message to /10.128.214.245 | 20:49:45,332 |  10.128.194.70 |           1120
                                                                                                  Row cache miss | 20:49:45,332 |  10.128.194.70 |           1611
                                                                     Executing single-partition query on metrics | 20:49:45,332 |  10.128.194.70 |           1687
                                                                                    Acquiring sstable references | 20:49:45,332 |  10.128.194.70 |           1692
                                                                                     Merging memtable tombstones | 20:49:45,332 |  10.128.194.70 |           1692
                                                                                 Key cache hit for sstable 13958 | 20:49:45,332 |  10.128.194.70 |           1714
                                                                     Seeking to partition beginning in data file | 20:49:45,332 |  10.128.194.70 |           1856
                                                                                 Key cache hit for sstable 14036 | 20:49:45,333 |  10.128.194.70 |           2271
                                                                     Seeking to partition beginning in data file | 20:49:45,333 |  10.128.194.70 |           2271
                                       Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones | 20:49:45,333 |  10.128.194.70 |           2540
                                                                      Merging data from memtables and 2 sstables | 20:49:45,333 |  10.128.194.70 |           2564
                                                                              Read 0 live and 1 tombstoned cells | 20:49:45,333 |  10.128.194.70 |           2632
                                                                              Sending message to /10.128.195.149 | 20:49:45,335 |  10.128.194.70 |           null
                                                                            Message received from /10.128.194.70 | 20:49:45,335 | 10.128.180.108 |             43
                                                                              Sending message to /10.128.180.108 | 20:49:45,335 |  10.128.194.70 |           null
                                                                                  Acquiring switchLock read lock | 20:49:45,335 | 10.128.180.108 |            297
                                                                                          Appending to commitlog | 20:49:45,335 | 10.128.180.108 |            312
                                                                            Message received from /10.128.194.70 | 20:49:45,336 | 10.128.195.149 |             53
                                                                                      Adding to metrics memtable | 20:49:45,336 | 10.128.180.108 |            374
                                                                            Enqueuing response to /10.128.194.70 | 20:49:45,336 | 10.128.180.108 |            445
                                                                               Sending message to /10.128.194.70 | 20:49:45,336 | 10.128.180.108 |            677
                                                                           Message received from /10.128.180.108 | 20:49:45,337 |  10.128.194.70 |           null
                                                                        Processing response from /10.128.180.108 | 20:49:45,337 |  10.128.194.70 |           null
                                                                                  Acquiring switchLock read lock | 20:49:45,338 | 10.128.195.149 |           1874
                                                                                          Appending to commitlog | 20:49:45,338 | 10.128.195.149 |           1970
                                                                                      Adding to metrics memtable | 20:49:45,338 | 10.128.195.149 |           2027
                                                                            Enqueuing response to /10.128.194.70 | 20:49:45,338 | 10.128.195.149 |           2147
                                                                               Sending message to /10.128.194.70 | 20:49:45,338 | 10.128.195.149 |           2572
                                                                           Message received from /10.128.195.149 | 20:49:45,339 |  10.128.194.70 |           null
                                                                        Processing response from /10.128.195.149 | 20:49:45,339 |  10.128.194.70 |           null
                                                                                                Request complete | 20:49:45,331 | 10.128.214.245 |           3556


cqlsh:analytics> select * from metrics where id = '36122cc69a7a12e266ab40f5b7756daee75bd0d2735a707b369302acb879eedc';

(0 rows)


Tracing session: 28f1f7b0-d172-11e3-a039-67dcdc0d02de

 activity                                                                                                                 | timestamp    | source         | source_elapsed
--------------------------------------------------------------------------------------------------------------------------+--------------+----------------+----------------
                                                                                                   execute_cql3_query | 20:49:59,468 | 10.128.214.245 |              0
 Parsing select * from metrics where id = '36122cc69a7a12e266ab40f5b7756daee75bd0d2735a707b369302acb879eedc' LIMIT 10000; | 20:49:59,468 | 10.128.214.245 |            119
                                                                                                  Preparing statement | 20:49:59,468 | 10.128.214.245 |            235
                                                                                   Sending message to /10.128.180.108 | 20:49:59,468 | 10.128.214.245 |            574
                                                                                Message received from /10.128.214.245 | 20:49:59,469 | 10.128.180.108 |             49
                                                                                                       Row cache miss | 20:49:59,470 | 10.128.180.108 |            817
                                                                          Executing single-partition query on metrics | 20:49:59,470 | 10.128.180.108 |            877
                                                                                         Acquiring sstable references | 20:49:59,470 | 10.128.180.108 |            888
                                                                                          Merging memtable tombstones | 20:49:59,470 | 10.128.180.108 |            938
                                                                                       Key cache hit for sstable 5399 | 20:49:59,470 | 10.128.180.108 |           1025
                                                                          Seeking to partition beginning in data file | 20:49:59,470 | 10.128.180.108 |           1033
                                                                                Message received from /10.128.180.108 | 20:49:59,471 | 10.128.214.245 |           3378
                                            Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 20:49:59,471 | 10.128.180.108 |           1495
                                                                             Processing response from /10.128.180.108 | 20:49:59,471 | 10.128.214.245 |           3466
                                                                           Merging data from memtables and 1 sstables | 20:49:59,471 | 10.128.180.108 |           1507
                                                                                   Read 0 live and 1 tombstoned cells | 20:49:59,471 | 10.128.180.108 |           1660
                                                                                   Read 0 live and 0 tombstoned cells | 20:49:59,471 | 10.128.180.108 |           1759
                                                                                Enqueuing response to /10.128.214.245 | 20:49:59,471 | 10.128.180.108 |           1817
                                                                                   Sending message to /10.128.214.245 | 20:49:59,471 | 10.128.180.108 |           1977
                                                                                                     Request complete | 20:49:59,471 | 10.128.214.245 |           3638

So this is pretty straightforward. From my reading, it might have to do with tombstone timestamps that somehow got messed up. However, it's been several days and "now" still hasn't caught up to the future timestamp of the tombstone. Is there any way to just reset every single timestamp in the entire cluster to 0 while activity is stopped? I can live with slightly inaccurate data right now.

The second issue affects only some of my tables. Some tables reflect changes instantly, but for others, changes show up 30 minutes to an hour later. I can't figure out how timestamps might relate to this.

I've synced all nodes of my cluster using NTP. It's not the most precise, but the clocks won't be out of sync on the scale of days or anything. All the nodes have been synced like this from the beginning; at no point did I have wildly out-of-sync times.

Can anybody help? As I said, at this point I'd settle for shutting down access to the cluster and resetting all timestamps to 0. I don't care about getting some of the order wrong, I just want this thing to work.

Thanks

score 1 · Answer 1 · answered May 07 '14 at 14:42

Timestamps are immutable. You'd have to truncate the table and rebuild it. The easiest way to rebuild is to just insert correct data, but if that's not an option, you can round trip through sstable2json -> edit timestamps -> json2sstable.
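
For the "edit timestamps" step, here is a rough sketch of what the middle of that round trip could look like. It assumes the sstable2json output is a JSON array of rows, each with a "columns" list whose entries carry the write timestamp (microseconds since the epoch) as the third element; check this against what your Cassandra version actually emits. The function names and file paths are hypothetical. Counter columns and row-level deletion markers may be laid out differently and are not handled here.

```python
import json
import time


def clamp_future_timestamps(rows, now_micros=None, replacement=0):
    """Return (new_rows, changed_count) with future-dated column timestamps replaced.

    ASSUMPTION: each row is a dict with a "columns" list, and each column is a
    list whose third element is an integer timestamp in microseconds.
    The input is not modified.
    """
    if now_micros is None:
        now_micros = int(time.time() * 1_000_000)

    fixed_rows = []
    changed = 0
    for row in rows:
        new_columns = []
        for column in row.get("columns", []):
            column = list(column)  # copy so the caller's data stays untouched
            has_timestamp = len(column) >= 3 and isinstance(column[2], int)
            if has_timestamp and column[2] > now_micros:
                column[2] = replacement
                changed += 1
            new_columns.append(column)
        new_row = dict(row)
        new_row["columns"] = new_columns
        fixed_rows.append(new_row)
    return fixed_rows, changed


def rewrite_export(src_path, dst_path):
    """Read an exported JSON file, clamp future timestamps, write the result."""
    with open(src_path) as src:
        rows = json.load(src)
    fixed_rows, changed = clamp_future_timestamps(rows)
    with open(dst_path, "w") as dst:
        json.dump(fixed_rows, dst)
    return changed
```

You would call something like rewrite_export("metrics.json", "metrics-fixed.json") between the export and the import, with the node's writes stopped. Only timestamps later than "now" are touched, so data that was written with sane timestamps keeps its ordering.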

jbellis
