Currently I have the following setup:
StoreBuilder storeBuilder = Stores.keyValueStoreBuilder(
        Stores.persistentKeyValueStore("kafka.topics.table"),
        new SomeKeySerde(),
        new SomeValueSerde());
streamsBuilder.addStateStore(storeBuilder);

final KStream<byte[], SomeClass> requestsStream = streamsBuilder
        .stream("myTopic", Consumed.with(Serdes.ByteArray(), theSerde));

requestsStream
        .filter((key, request) -> Objects.nonNull(request))
        .process(() -> new SomeClassUpdater("kafka.topics.table", maxNumMatches), "kafka.topics.table");

Properties streamsConfiguration = loadConfiguration();
KafkaStreams streams = new KafkaStreams(streamsBuilder.build(), streamsConfiguration);
streams.start();
Why do I need the local state store at all, given that I'm not doing any other computation with it and the data is also stored in the Kafka changelog topic? Also, at what moment is a record written to the local store — does it write locally and commit to the changelog at the same time?
The problem I'm facing is that storing locally eventually leads to memory problems, especially when the application repartitions often, because the stores for the old partitions still sit around and fill up memory. So my question is: why do we need persistence with RocksDB at all, since:
- the data is already persisted in the Kafka changelog topic;
- the RAM disk is gone anyway once the container is gone.
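For context, the alternative I'm considering is an in-memory store that is still backed by the changelog topic, so state would be rebuilt from Kafka on restart instead of from RocksDB. A sketch of that change, reusing the store name and serde classes from my setup above (`SomeKeySerde`/`SomeValueSerde` are my own classes, not shown here):

    // In-memory store instead of RocksDB; fault tolerance comes from the
    // changelog topic, which Kafka Streams enables by default for this builder.
    StoreBuilder inMemoryStoreBuilder = Stores.keyValueStoreBuilder(
            Stores.inMemoryKeyValueStore("kafka.topics.table"),
            new SomeKeySerde(),
            new SomeValueSerde());
    streamsBuilder.addStateStore(inMemoryStoreBuilder);

Is this the intended way to avoid the local RocksDB files, or does it have drawbacks I'm missing (e.g. longer restore times after a restart)?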