
It is understood that YugabyteDB creates n key/value records per SQL row. Still, how does it manage to achieve high data density?

Jolta

1 Answer

@AVA

We use DocDB internally, which uses a heavily modified RocksDB. A summary from the doc links at the end of this answer:

  1. Each table gets split into several tablets, each tablet being a RocksDB instance.

  2. RocksDB instances use size-tiered compaction, which has low write IO amplification but higher space amplification (it can need up to 50% free disk space). This is mitigated by having several small tablets: we get the low write IO without the space-amplification concern.

  3. Global memstore limit. A large number of tablets adds no in-memory overhead, since all memstores draw from one pool.

  4. A global block cache handles caching across all sstables. This further reduces overhead compared to keeping a per-RocksDB cache.

  5. Globally throttled compactions and separate small/big compaction queues guard against compaction storms overwhelming the server.

  6. Striping tablet load uniformly across data disks: each tablet can reside on a different disk (JBOD). This load-balances sstable & WAL IO across the disks in the machine.

  7. Efficient C++ implementation. Having no stop-the-world GC helps keep latencies low and consistent.

  8. When the server is at full capacity, we stop accepting new queries. This keeps the server from crashing or being overwhelmed.

  9. On-disk block compression (low read/write IO).

  10. Uncompressed in-memory block cache: low CPU overhead on each read, and the ability to serve more hot queries.

  11. Disabling the RocksDB WAL and keeping only the Raft WAL. This reduces write IO.
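To make point 2 concrete, here is a back-of-the-envelope sketch (not YugabyteDB code; the function and the numbers are illustrative): a full size-tiered compaction temporarily needs room for both the old and new copies of whatever is compacting, so splitting the data into many small tablets shrinks that transient overhead.

```python
# Sketch: why many small tablets tame the ~50% free-space
# requirement of size-tiered compaction. Peak disk use is roughly
# existing data + a second copy of whatever is compacting at once.

def peak_disk_gb(total_data_gb, num_tablets, concurrent_compactions):
    """Transient peak disk usage while compactions run (illustrative)."""
    tablet_gb = total_data_gb / num_tablets
    return total_data_gb + tablet_gb * concurrent_compactions

# One big RocksDB instance holding 1000 GB: a major compaction may
# briefly need another ~1000 GB, i.e. 50% of the disk kept free.
print(peak_disk_gb(1000, 1, 1))      # 2000.0

# 100 tablets of 10 GB, compacting (say) 2 at a time: the transient
# overhead shrinks to ~20 GB, about 2% of the data size.
print(peak_disk_gb(1000, 100, 2))    # 1020.0
```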
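Points 3 and 4 boil down to pooling: per-tablet buffers are accounted against one global budget rather than each tablet reserving its own. A hypothetical sketch of the memstore side (the class and its flush-the-largest policy are my own illustration, not DocDB's actual code):

```python
# Sketch: many tablets share one global memstore budget, so tablet
# count does not multiply memory cost. When the *global* total
# crosses the limit, flush the largest memstore, not every tablet's.

class GlobalMemstorePool:
    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.memstores = {}          # tablet_id -> bytes buffered

    def write(self, tablet_id, nbytes):
        self.memstores[tablet_id] = self.memstores.get(tablet_id, 0) + nbytes
        while sum(self.memstores.values()) > self.limit:
            victim = max(self.memstores, key=self.memstores.get)
            self.flush(victim)

    def flush(self, tablet_id):
        # In a real system this would write the buffer out as an sstable.
        self.memstores[tablet_id] = 0

pool = GlobalMemstorePool(limit_bytes=100)
pool.write("t1", 60)
pool.write("t2", 30)
pool.write("t3", 30)   # global total hits 120 > 100 -> t1 (largest) flushed
```

The same pooling idea applies on the read side: one shared block cache across all RocksDB instances instead of a fixed-size cache per tablet.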
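Point 5 can be sketched as two queues plus a concurrency cap; the class, size threshold, and scheduling policy below are illustrative assumptions, not the actual implementation:

```python
# Sketch: separate small/big compaction queues plus a running-job
# cap, so a burst of large compactions (a "compaction storm")
# cannot monopolize the server or starve small compactions.
from collections import deque

class CompactionScheduler:
    SMALL_THRESHOLD_MB = 256         # illustrative cutoff
    MAX_RUNNING = 2                  # illustrative concurrency cap

    def __init__(self):
        self.small, self.big = deque(), deque()
        self.running = 0

    def submit(self, tablet_id, size_mb):
        q = self.small if size_mb <= self.SMALL_THRESHOLD_MB else self.big
        q.append((tablet_id, size_mb))

    def next_job(self):
        if self.running >= self.MAX_RUNNING:
            return None              # throttled: too many already running
        q = self.small or self.big   # prefer small compactions
        if not q:
            return None
        self.running += 1
        return q.popleft()

sched = CompactionScheduler()
sched.submit("t1", 1024)             # big
sched.submit("t2", 64)               # small -> scheduled first
print(sched.next_job())              # ('t2', 64)
print(sched.next_job())              # ('t1', 1024)
print(sched.next_job())              # None (cap reached)
```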
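Point 6 amounts to a placement policy; a minimal sketch, assuming a simple least-loaded heuristic (the real balancer is more involved):

```python
# Sketch: assign each new tablet to the least-loaded data disk in a
# JBOD set, so sstable & WAL IO spreads evenly across disks.

def assign_tablet(disk_load_gb, tablet_size_gb):
    """Pick the disk with the least data on it; return its index."""
    disk = min(range(len(disk_load_gb)), key=lambda i: disk_load_gb[i])
    disk_load_gb[disk] += tablet_size_gb
    return disk

disks = [0, 0, 0]                    # three data disks (JBOD)
placement = [assign_tablet(disks, 10) for _ in range(6)]
print(placement)                     # [0, 1, 2, 0, 1, 2]
print(disks)                         # [20, 20, 20] -- load is even
```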

https://docs.yugabyte.com/latest/architecture/docdb/performance/

https://docs.yugabyte.com/latest/architecture/concepts/yb-tserver/

dh YB