
I have a large, frequently changing set of objects, and I need to maintain a kind of state table for each object. I'm considering using a KTable per object, but I'm worried about the overhead this structure would bring with it.

In that sense, what is the expected overhead of a KTable when each object gets its own topic and table, compared to a layout where objects don't each get their own topic? For example, how much memory does a topic with a KTable consume?

To give an example (these are not actual numbers, but the relative numbers are similar to what I'm looking for):

  • 1M objects, each object has a topic with one partition
  • 20 producers, 20 consumers
  • message size 1 KB
  • update rate 100k messages per second
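
Roughly, the per-object layout I have in mind, as a minimal Kafka Streams sketch (the topic naming scheme, serdes, and value type here are placeholders, not settled choices):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;

import java.util.Properties;

public class PerObjectTables {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "per-object-state");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // One source topic and one KTable per object, e.g. "object-state-<id>".
        // With 1M objects this declares 1M single-partition source topics;
        // the per-topic/per-table footprint is exactly the overhead in question.
        for (int id = 0; id < 1_000_000; id++) {
            builder.table("object-state-" + id,
                    Consumed.with(Serdes.String(), Serdes.String()));
        }

        new KafkaStreams(builder.build(), props).start();
    }
}
```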
  • There's no way to answer this without knowing how large each binary event is. Creating a table alone doesn't affect memory of the brokers or topics, and they can/should be written to disk rather than maintained completely in memory – OneCricketeer Feb 23 '22 at 15:33
  • I mean things like `replica.fetch.max.bytes`, which is 1MB by default. There must be some known overhead per topic/partition, I guess, no? https://docs.confluent.io/platform/current/installation/configuration/broker-configs.html – benjist Feb 23 '22 at 16:07
  • There's network overhead with any consumer, not memory overhead – OneCricketeer Feb 23 '22 at 16:08
  • According to this site, there is a memory overhead per partition and topic: https://docs.cloudera.com/documentation/kafka/latest/topics/kafka_performance.html#concept_exp_hzk_br – benjist Feb 23 '22 at 16:15
  • That should be obvious. The brokers maintain and track the partitions. That has nothing to do with tables or topic consumers. – OneCricketeer Feb 23 '22 at 16:17
  • To clarify, `replica.fetch.max.bytes` is _between brokers_. Not for external clients – OneCricketeer Feb 23 '22 at 16:19
  • I may have been imprecise. I also mean the broker side, not just the consumer. So that would translate to 1,000 topics (each with a single partition) = 1 GB on the broker side. – benjist Feb 23 '22 at 16:33
  • A topic alone doesn't take 1MB of heap. Otherwise, the clusters I've seen with several thousand topics and only a 6GB heap size wouldn't be running. The `fetch.max.bytes` settings are exclusively network buffer sizes, not statically allocated server heap space – OneCricketeer Feb 23 '22 at 16:41
  • I've added a small example – benjist Feb 23 '22 at 16:51
  • I don't really have a specific answer. Sticking with my earlier comment that tables don't cause significant strain on any topic; at least, none beyond a regular consumer. Overall, I don't think Kafka can support a million topics [without KRaft mode](https://stackoverflow.com/a/32963227/2308683), which is not production-ready yet. – OneCricketeer Feb 23 '22 at 19:44
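
For contrast, a sketch of the single-topic pattern the comments point toward: one topic keyed by object ID backing a single KTable, so the table holds the latest state per object without a topic per object (the topic name `object-state` and the String serdes are assumptions for illustration, not from the question):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;

import java.util.Properties;

public class KeyedStateTable {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "keyed-object-state");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // One (ideally compacted) topic for all objects; the object ID is the
        // record key, so the KTable keeps the latest value per key. A single
        // table then covers all 1M objects without 1M topics.
        KTable<String, String> objectState = builder.table("object-state",
                Consumed.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```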

0 Answers