I am researching how databases are designed internally. I found that there are three main components:

  1. WAL - write-ahead log
  2. Memtable - an in-memory data structure, such as a red-black tree or a skip list
  3. SSTables - files on disk
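
To make the three roles concrete, here is a minimal single-threaded sketch of how they might fit together. The class `ToyLsmStore` and all of its methods are hypothetical names made up for illustration; the WAL and the SSTables are in-memory stand-ins rather than real files, and this is not how Cassandra or any specific database implements them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical toy store: in-memory stand-ins replace the on-disk files.
class ToyLsmStore {
    // Stand-in for the write-ahead log (a real one is an append-only file).
    private final List<String> wal = new ArrayList<>();
    // Memtable: a sorted in-memory structure (here a skip list).
    private ConcurrentSkipListMap<String, String> memtable = new ConcurrentSkipListMap<>();
    // Stand-ins for SSTables: immutable sorted maps, oldest first.
    private final List<SortedMap<String, String>> sstables = new ArrayList<>();

    // Write path: record the mutation in the log first, then update the memtable.
    void put(String key, String value) {
        wal.add(key + "=" + value);
        memtable.put(key, value);
    }

    // Read path: check the memtable, then the SSTables from newest to oldest.
    String get(String key) {
        String value = memtable.get(key);
        if (value != null) {
            return value;
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Flush: freeze the memtable as an immutable sorted table, start a new
    // memtable, and drop the log entries that the flushed table now covers.
    void flush() {
        sstables.add(Collections.unmodifiableSortedMap(new TreeMap<>(memtable)));
        memtable = new ConcurrentSkipListMap<>();
        wal.clear();
    }

    int walSize() {
        return wal.size();
    }
}
```

In this sketch a key written after a flush shadows the older value in the flushed table, because `get` consults the memtable before any SSTable.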

Now consider a database like Cassandra. Its writes are sequential, so updating the memtable by itself is not a problem. But suppose a write and a read arrive at the same time: how does a database like Cassandra handle that?

I am asking because, if the database uses a red-black tree as its memtable, a write may trigger a restructuring of the tree, and a read that runs at the same time could then see inconsistent data.

Alternatively, if the database takes a lock on the red-black tree before every write, performance would degrade badly, because thousands of reads could be waiting for the lock to be released.

Can someone explain how this works?

voila

2 Answers

That's not really relevant, because Cassandra does not use a locking mechanism (with the exception of lightweight transactions). Writes are therefore non-blocking, which is part of the reason they are so fast.

The mutations are persisted to the commitlog by appending, which means there are no disk seeks and no sorting or ordering involved.
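
As a rough illustration of what "persisting by appending" means, here is a minimal single-writer sketch of an append-only log. The class name `AppendOnlyLog` and the one-record-per-line format are assumptions made for this example; this is not Cassandra's commitlog code or its on-disk format.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only log, intended for a single writer thread.
class AppendOnlyLog implements AutoCloseable {
    private final FileChannel channel;

    AppendOnlyLog(Path path) throws IOException {
        // APPEND makes every write land at the current end of the file,
        // so the writer never seeks to an earlier position.
        channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    // Appends one record in arrival order and returns the file size afterwards.
    long append(String record) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
        // Ask the OS to push the data to the storage device. Syncing on every
        // append is the simplest choice, not the fastest one.
        channel.force(false);
        return channel.size();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Records end up in the file in exactly the order they were appended; nothing is sorted at write time, which is why the sorted view has to live in the memtable instead.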

Cassandra really is designed for high velocity throughput that supports reads/writes at internet scale. This is just one of the reasons tech giants who are household names choose Cassandra for their demanding workloads. Cheers!

Erick Ramirez

RocksDB uses a skip list that supports lock-free simultaneous reads and writes. Because of the invariants of the structure, individual entries cannot be reclaimed, so deletes have to use tombstones, and the entire skip list is reclaimed after it has been flushed to an SST file.
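
The same idea can be sketched in Java with the standard library's `java.util.concurrent.ConcurrentSkipListMap`, which allows reads and inserts to run concurrently without an external lock. This is only an analogy: it is not RocksDB's skip list (which is written in C++), and the class `SkipListMemtable`, its methods, and the use of an empty `Optional` as the tombstone are assumptions made for this example.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical memtable: entries are only ever added or overwritten, never removed.
class SkipListMemtable {
    // An empty Optional serves as the tombstone for a deleted key.
    private final ConcurrentSkipListMap<String, Optional<String>> map =
            new ConcurrentSkipListMap<>();

    void put(String key, String value) {
        map.put(key, Optional.of(value));
    }

    // A delete writes a tombstone instead of unlinking the entry.
    void delete(String key) {
        map.put(key, Optional.empty());
    }

    // Returns null when the key is absent or has been deleted.
    String get(String key) {
        Optional<String> value = map.get(key);
        return value == null ? null : value.orElse(null);
    }

    boolean isDeleted(String key) {
        Optional<String> value = map.get(key);
        return value != null && value.isEmpty();
    }

    int entryCount() {
        return map.size();
    }

    // Runs one writer and one reader at the same time with no explicit lock.
    static SkipListMemtable fillWhileReading(int n) throws InterruptedException {
        SkipListMemtable memtable = new SkipListMemtable();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                memtable.put("key" + i, "v" + i);
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                // Each read sees either null or the value, depending on timing.
                memtable.get("key" + i);
            }
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        return memtable;
    }
}
```

Deleting a key here leaves the entry count unchanged, since the tombstone overwrites the value in place. Reclaiming memory corresponds to dropping the whole `SkipListMemtable` object once its contents have been written out.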

More about the WAL in the RocksDB write path:

  • https://github.com/facebook/rocksdb/wiki/WAL-Performance
  • https://github.com/facebook/rocksdb/wiki/Pipelined-Write
  • https://github.com/facebook/rocksdb/wiki/unordered_write

Peter Dillinger