Questions tagged [apache-kafka-streams]

Related to Kafka Streams, Apache Kafka's built-in stream processing engine: a Java library for building fault-tolerant, distributed stream processing applications on top of streams of data records from Apache Kafka topics.

Kafka Streams applications transform input Kafka topics into output Kafka topics (or into calls to external services, updates to databases, or whatever), with concise code and in a way that is distributed and fault-tolerant.

Documentation: https://kafka.apache.org/documentation/streams/
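As a minimal illustration of the "input topics to output topics" model described above, the following sketch (topic names, serdes, and the upper-casing transformation are assumptions for the example, not taken from any question below) builds a one-step topology and exercises it with `TopologyTestDriver`, so no running broker is needed:

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class UppercaseTopology {

    // Transform the hypothetical "input" topic into the "output" topic.
    static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> value.toUpperCase()) // the stateless transformation
               .to("output", Produced.with(Serdes.String(), Serdes.String()));
        return builder.build();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:9092"); // not contacted by the test driver

        // TopologyTestDriver runs the topology in-process, without a broker.
        try (TopologyTestDriver driver = new TopologyTestDriver(build(), props)) {
            TestInputTopic<String, String> in =
                driver.createInputTopic("input", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, String> out =
                driver.createOutputTopic("output", new StringDeserializer(), new StringDeserializer());

            in.pipeInput("key", "hello streams");
            System.out.println(out.readValue()); // prints "HELLO STREAMS"
        }
    }
}
```

The same pattern scales to the stateful operations (joins, aggregations, windowing) that most of the questions below concern.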

3924 questions
1
vote
1 answer

Kafka Streams BoundedMemoryRocksDBConfig

I'm trying to understand how the internals of Kafka Streams work with respect to the cache and RocksDB (state store). KTable, String> kTable = kStreamMapValues …
1
vote
0 answers

AVRO : Keeping union record field names when using union record type

I am using Kafka Streams to transform XML messages to Avro format. I would like to know if it is possible to keep the field names of my union records when using the union type for records in my Avro schema, as in the example below, so that instead of…
Raoul
  • 331
  • 1
  • 6
  • 7
1
vote
0 answers

Best practice for implementing Micronaut/Kafka-Streams with more than one KStream/KTable?

There are several details about the example Micronaut/Kafka Streams application which I don't understand. Here is the example class from the documentation (original link:…
Sparky
  • 2,694
  • 3
  • 21
  • 31
1
vote
1 answer

Kafka Streams Reduce vs Suppress

While reading up on the suppress() documentation, I saw that the time window will not advance unless records are being published to the topic, because it's based on event time. Right now, my code is outputting the final value for each key, because…
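For context, the kind of suppressed windowed aggregation this question describes usually looks like the sketch below (the topic name, serdes, window sizes, and the use of `count()` are assumptions for illustration). The key behavior the question points at is that `suppress()` releases a window's final result only once stream time, which advances with the event timestamps of newly arriving records, passes the window end plus the grace period:

```java
import java.time.Duration;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

public class SuppressSketch {

    static Topology build() {
        StreamsBuilder builder = new StreamsBuilder();

        // Count events per key in 1-minute windows with a 30-second grace period.
        KTable<Windowed<String>, Long> finalCounts = builder
            .stream("events", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .windowedBy(TimeWindows.of(Duration.ofMinutes(1)).grace(Duration.ofSeconds(30)))
            .count()
            // Emit exactly one final result per window, but only after stream time
            // passes the window end plus the grace period. If no new records arrive,
            // stream time stands still and nothing is emitted -- there is no
            // wall-clock trigger in suppress().
            .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()));

        // finalCounts can then be streamed to an output topic
        // (with a windowed-key serde configured for the sink).
        return builder.build();
    }
}
```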
1
vote
1 answer

Integrating a Kafka Streams application with the Kafka JDBC sink connector

I am trying to use Kafka Streams for some sort of computation and send the result of the computation to a topic which is sunk to a database by the JDBC sink connector. The result needs to be serialized using Avro with the Confluent Schema Registry. Is there…
1
vote
0 answers

How to publish an event during a Kafka Streams aggregation?

I need to emit an event the first time the aggregated value reaches 100. This is what it looks like right now: StreamsBuilder sb = new StreamsBuilder(); KTable example = sb.stream(inputs, Consumed.with(Serdes.Integer(),…
niukasu
  • 267
  • 2
  • 4
  • 8
1
vote
1 answer

Kafka Streams: event-time skew when processing messages from different partitions

Let's consider a topic with multiple partitions and messages written in event-time order without any particular partitioning scheme. A Kafka Streams application does some transformations on these messages, then groups them by some key, and then aggregates…
Boris Sukhinin
  • 110
  • 1
  • 7
1
vote
1 answer

Spring Boot: how to idiomatically configure Schema Registry Serdes in spring-kafka

Are there examples of configuring SpecificAvroSerdes (or any schema-registry-based serdes, such as JsonSchema and Protobuf) in spring-kafka that allow leveraging some of the autoconfiguration (based on YAML or properties files)? There are a few similar…
1
vote
0 answers

Kafka Streams: how to scale out the input topic partitions seamlessly when the topology has state involved

We heavily use Kafka for our messaging needs (it replaced MQ in our apps), and it's one of the best decisions we made, as it scales out easily as and when we need. We increase partitions when we know there will be new data coming into the system. This…
1
vote
1 answer

How to limit the event consumption speed of a Kafka consumer so that the service is not impacted

How do we limit the event consumption speed of a Kafka consumer so that the service is not impacted? We are getting a huge amount of data in the Kafka topic and need to process all the events. We have 8 consumers reading from 12 partitions, but…
1
vote
1 answer

How does Kafka handle a situation where retention period expires while a consumer offset is within the segment file?

I want to know how Kafka would handle this situation. A consumer has come across a poison pill message and is not committing past it. No one notices for a long time (15 days). The retention period on the topic is 7 days. Let's say that…
1
vote
1 answer

Kafka Streams: Add Sequence to each message within a group of message

Setup: Kafka 2.5, Apache Kafka Streams 2.4, deployed to OpenShift (containerized). Objective: group a set of messages from a topic using a set of value attributes & assign a unique group identifier -- this can be achieved by using selectKey and…
mandev
  • 23
  • 1
  • 4
1
vote
1 answer

What is the difference between the HIGH, MEDIUM, LOW importance levels in Apache Kafka options?

I want to know the difference between the Kafka option importance levels. There are 3 levels in Apache Kafka. org.apache.kafka.common.config.ConfigDef.Importance public enum Importance { HIGH, MEDIUM, LOW } What's the difference between these three?
1
vote
0 answers

NullPointerException when forwarding a record in a Punctuator (kafka clients 2.5.0)

I am working on a Kafka Streams application based on Spring Boot and Java 8. We use Kafka clients version 2.5.0. I noticed that sometimes (not always), when forwarding a record from a punctuator, the operation fails with a NullPointerException. Here…
filmac
  • 177
  • 2
  • 15
1
vote
1 answer

Which timezone is kafka consumer timestamp using and how to change it?

I'm producing data and consuming it using Kafka. In the consumer, I print the consumed data using the following code: consumer = connect_kafka_consumer() for message in consumer: print (message) Here is the output of one…