Currently only about 70 records/s are processed on a single node with a single Kafka broker. Throughput is low, and so are CPU and memory utilisation. My topology:
projekte
    .leftJoin(wirtschaftseinheiten)   // enrich the Projekt with its Wirtschaftseinheit
    .leftJoin(mietobjekte)            // ... and with its Mietobjekt
    .groupByKey()                     // cogroup is only available on a grouped stream
    .cogroup { _, current, previous: ProjektAggregat ->
        // carry the freshly joined project data into the running aggregate
        previous.copy(
            projekt = current.projekt,
            wirtschaftseinheit = current.wirtschaftseinheit,
            mietobjekt = current.mietobjekt,
            projektErstelltAm = current.projektErstelltAm
        )
    }
    // merge the remaining per-project event streams into the same aggregate
    .cogroup(projektstatus.groupByKey()) { _, projektstatusEvent, aggregat -> aggregat + projektstatusEvent }
    .cogroup(befunde.groupByKey()) { _, befundAggregat, aggregat -> aggregat + befundAggregat }
    .cogroup(aufgaben.groupByKey()) { _, aufgabeAggregat, aggregat -> aggregat + aufgabeAggregat }
    .cogroup(durchfuehrungen.groupByKey()) { _, durchfuehrungAggregat, aggregat -> aggregat + durchfuehrungAggregat }
    .cogroup(gruppen.groupByKey()) { _, gruppeAggregat, aggregat -> aggregat + gruppeAggregat }
    .aggregate({ ProjektAggregat() }, Materialized.`as`(projektStoreSupplier))
I've tried increasing a couple of buffer/request sizes to feed more data through the stream (applied roughly as in the sketch below the list):
- cache.max.bytes.buffering: 52428800
- max.request.size: 52428800

Neither change measurably improved throughput.
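For completeness, this is roughly how those settings are applied; a minimal sketch only, where the application id and bootstrap servers are placeholders rather than my real configuration:

import java.util.Properties
import org.apache.kafka.clients.producer.ProducerConfig
import org.apache.kafka.streams.StreamsConfig

val props = Properties().apply {
    put(StreamsConfig.APPLICATION_ID_CONFIG, "projekt-aggregat")          // placeholder
    put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")         // placeholder
    // cache.max.bytes.buffering: larger record cache before flushing downstream
    put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 52428800)
    // max.request.size: applied to the producer embedded in the Streams app
    put(StreamsConfig.producerPrefix(ProducerConfig.MAX_REQUEST_SIZE_CONFIG), 52428800)
}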
How can I increase throughput to achieve optimal system utilisation?