
I am trying to understand RocksDB behavior in the Kafka Streams Processor API. I am configuring a persistent StateStore using the default RocksDB store that Kafka Streams provides.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

StoreBuilder<KeyValueStore<String, Long>> countStoreBuilder =
    Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("Counts"),
        Serdes.String(),
        Serdes.Long());

I am not doing any aggregation, join, or windowing. I am just receiving records, comparing some of them against previous entries in the store, and storing some of the incoming records in the state store.
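
Concretely, the processing logic looks roughly like the following sketch (the class name and the comparison are only placeholders to illustrate the access pattern, not my real logic):

import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class CompareAndStoreProcessor implements Processor<String, Long> {

    private KeyValueStore<String, Long> store;

    @Override
    @SuppressWarnings("unchecked")
    public void init(final ProcessorContext context) {
        // the "Counts" store registered via the StoreBuilder above
        store = (KeyValueStore<String, Long>) context.getStateStore("Counts");
    }

    @Override
    public void process(final String key, final Long value) {
        final Long previous = store.get(key);        // look up the previous record for this key
        if (previous == null || value > previous) {  // placeholder comparison
            store.put(key, value);                   // keep only the records of interest
        }
    }

    @Override
    public void close() { }
}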

The developer guide mentions that you can enable record caches in the Processor API by calling .withCachingEnabled() on the above builder.
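
If I understand correctly, that would just be the same builder as above with caching turned on:

StoreBuilder<KeyValueStore<String, Long>> countStoreBuilder =
    Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("Counts"),
        Serdes.String(),
        Serdes.Long())
    .withCachingEnabled();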

The cache "serves as a read cache to speed up reading data from a state store" – Record Caches, Kafka Streams documentation

However, my understanding is that a persistent RocksDB store first buffers writes in memory and spills to disk only if the state doesn't fit in RAM.

"RocksDB is just used as an internal lookup table (that is able to flush to disk if the state does not fit into memory) [...] RocksDB flushing is only required because state could be larger than available main-memory." – Kafka Streams Internal Data Management
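
As far as I understand, the in-memory buffering on the RocksDB side is RocksDB's own write buffer (memtable), which Kafka Streams exposes for tuning through a RocksDBConfigSetter. Something like the following sketch (the 16 MB value is only illustrative, not what I actually configure):

import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

public class ExampleRocksDBConfig implements RocksDBConfigSetter {
    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        // RocksDB buffers writes in an in-memory memtable before flushing to disk;
        // this sets that write buffer to 16 MB per store (illustrative value only)
        options.setWriteBufferSize(16 * 1024 * 1024);
    }
}

// registered via:
// props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, ExampleRocksDBConfig.class);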

So how do record caches speed up reads from the state store if both are buffered in memory? It seems to me that record caches overlap with RocksDB's own behavior.

iah10
1 Answer


Your observation is correct, and it depends on the use case whether caching is desirable or not. One big advantage of application-level caching (instead of RocksDB caching) is that it reduces the number of records written into the changelog topic that is used to make the store fault-tolerant. Hence, it reduces the load on the Kafka cluster and may also reduce recovery time.
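
As a side note on tuning: how much "de-duplication" the cache can do before flushing to the store and the changelog is bounded by the cache size and the commit interval, both set via StreamsConfig. A minimal sketch (the values are only illustrative):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

final Properties props = new Properties();
// total cache size shared by all threads of the instance; 0 disables caching
props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 10 * 1024 * 1024L);
// the cache is flushed on every commit, so a larger interval allows more de-duping
// (it is also flushed whenever it runs full)
props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 30 * 1000L);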

For DSL users, caching also has an impact on downstream load (something that might not be relevant for your application, as it seems you are using the Processor API).

Matthias J. Sax
  • Thank you @matthias-j-sax for your answer. Indeed, I am using the Processor API. I get the advantage of record caches as "a write-back cache that allows for batching multiple records instead of sending each record individually to the state store. It also reduces the number of requests going to a state store. This results in reduced network I/O to Kafka and reduced local disk I/O to RocksDB-backed state stores." However, I can't see how it could reduce recovery time? – iah10 Jun 03 '19 at 08:54
  • Topic-partitions are divided into segments, and the active segment is not subject to log compaction. Hence, if you update a small set of keys very often, you get a lot of duplicates in the changelog topic that are not immediately compacted away -- hence, record caching that "de-dupes" those updates before writing to the changelog may reduce the number of duplicates in the changelog topic. – Matthias J. Sax Jun 03 '19 at 15:23