1

I have a question about the WAL (commit log) and memtable writes in Cassandra. By default, Cassandra doesn't mark a write as complete until both the WAL and the memtable have been updated. However, if the WAL write succeeds and the memtable write fails, isn't Cassandra in an inconsistent state?

I mean, the memtable is volatile: if the node crashes and its memory is lost, the memtable is rebuilt from the WAL. So if a write succeeded in the WAL but not in the memtable, won't it mistakenly show up in the memtable once the memtable is regenerated from the WAL?
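
To make the scenario concrete, here is a toy Python sketch of what I have in mind. It is not Cassandra's code: `ToyNode`, `write`, `replay`, and the `crash_before_memtable` flag are hypothetical names made up for illustration, and the only assumption is the ordering described above (append to the log first, then update the memtable, then rebuild the memtable from the log after a restart).

```python
class ToyNode:
    """Toy model of a write path: append to a log, then apply to an in-memory table."""

    def __init__(self, commit_log=None):
        # The list stands in for durable storage and is handed to the restarted node.
        self.commit_log = commit_log if commit_log is not None else []
        # The dict stands in for the memtable and is lost when the node goes away.
        self.memtable = {}

    def write(self, key, value, crash_before_memtable=False):
        self.commit_log.append((key, value))  # step 1: durable append
        if crash_before_memtable:
            # Hypothetical failure injected between the two steps.
            raise RuntimeError("node crashed before the memtable update")
        self.memtable[key] = value            # step 2: in-memory apply
        return "ack"

    def replay(self):
        # On restart, rebuild the memtable from every logged mutation.
        for key, value in self.commit_log:
            self.memtable[key] = value


node = ToyNode()
node.write("k1", "v1")
try:
    node.write("k2", "v2", crash_before_memtable=True)
except RuntimeError:
    pass  # the client never receives an acknowledgement for k2

# Simulate a restart: a fresh node with an empty memtable and the old log.
restarted = ToyNode(commit_log=node.commit_log)
restarted.replay()
print(restarted.memtable)  # {'k1': 'v1', 'k2': 'v2'}
```

In this toy model the unacknowledged `k2` write is present after the replay, which is exactly the case I am asking about.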

user12331

1 Answer

3

In all my time working with users/developers/customers working on hundreds of clusters, I've never come across a situation where a memtable update failed but the mutation was persisted to the commit log.

You haven't provided any details on why you think it is possible at all or how to replicate the issue. If you do, I'd be happy to update my answer. Cheers!

Erick Ramirez
  • This is just based on reading the documentation (https://cassandra.apache.org/doc/latest/cassandra/architecture/storage_engine.html): "Commitlogs are an append only log of all mutations local to a Cassandra node. Any data written to Cassandra will first be written to a commit log before being written to a memtable". This article says the data is stored simultaneously: https://www.red-gate.com/simple-talk/blogs/understanding-data-modifications-in-cassandra/ I am trying to understand what prevents the write to happen in WAL but not in memtable, since they are two separate writes. – user12331 Mar 23 '22 at 06:04
  • I'm confused. I don't see where it says "prevents the write to happen in WAL". – Erick Ramirez Mar 23 '22 at 06:21
  • The statement above says "Commitlogs are an append only log of all mutations ........ before being written to a memtable". Writing the data to the commit log first is fine. But it then has to be written to the memtable before the write is called successful. If the node crashes in between, the write has happened in the WAL but not in the memtable. The write is unsuccessful as far as the client is concerned. When the node comes back up, the memtable will be rebuilt from the WAL. So wouldn't it mistakenly load the last entry, for which a write failure was sent? – user12331 Mar 23 '22 at 06:38
  • Let's assume that this happened on all the replicas for the given key during that write. Isn't there a chance that the data ends up in the wrong state? I am just trying to understand this scenario and how something like this is prevented. I might be thinking about this wrong. – user12331 Mar 23 '22 at 06:40
  • @user12331 this is getting out-of-hand a bit. It would be better if we could quickly discuss in StackOverflow's chat room -- https://chat.stackoverflow.com/rooms/243217/so-71579583-user12331. – Erick Ramirez Mar 23 '22 at 07:29
  • Your assumptions are incorrect. There is no scenario where a mutation gets persisted to the commit log without being written to the memtable. Cheers! P.S. I left the SO chat room after an hour because I didn't see you join. – Erick Ramirez Mar 23 '22 at 08:26
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/243251/discussion-between-user12331-and-erick-ramirez). – user12331 Mar 23 '22 at 17:04
  • Sorry I didn't see that message yesterday. Thanks for waiting. I will be in the chat today to discuss – user12331 Mar 23 '22 at 17:05