Questions tagged [apache-kafka-streams]

For questions about Kafka Streams, Apache Kafka's built-in stream processing engine: a Java library for building distributed stream processing applications on top of Apache Kafka.

Kafka Streams is a Java library for building fault-tolerant distributed stream processing applications using streams of data records from topics in Apache Kafka.

Kafka Streams is a library for building streaming applications, specifically applications that transform input Kafka topics into output Kafka topics (or into calls to external services, updates to databases, or other side effects). It lets you do this with concise code in a way that is distributed and fault-tolerant.

Documentation: https://kafka.apache.org/documentation/streams/

3924 questions
9
votes
1 answer

Kafka Streams to sort messages based on timestamp key in JSON message

I am publishing JSON messages to Kafka, e.g.: "UserID":111,"UpdateTime":06-13-2018 12:13:43.200Z,"Comments":2,"Like":10 "UserID":111,"UpdateTime":06-13-2018 12:13:40.200Z,"Comments":0,"Like":6 "UserID":222,"UpdateTime":06-13-2018…
Swati
9
votes
3 answers

Kafka Streams with Avro in Java: "schema.registry.url" which has no default value

I have the following configuration for my Kafka Stream application Properties config = new Properties(); config.put(StreamsConfig.APPLICATION_ID_CONFIG,this.applicaionId); …
9
votes
1 answer

How to always consume from latest offset in kafka-streams

Our requirement is that if a Kafka Streams app is consuming a partition, it should start its consumption from the latest offset of that partition. This seems doable using streamsConfiguration.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG,…
Saloni Vithalani
9
votes
2 answers

How to print KStream to console?

I have created a Kafka Topic and pushed a message to it. So bin/kafka-console-consumer --bootstrap-server abc.xyz.com:9092 --topic myTopic --from-beginning --property print.key=true --property key.separator="-" prints key1-customer1 on the…
Vicky
9
votes
1 answer

RecordTooLargeException in Kafka streams join

I have a KStream x KStream join which is failing with the following exception. Exception in thread "my-clicks-and-recs-join-streams-4c903fb1-5938-4919-9c56-2c8043b86986-StreamThread-1" org.apache.kafka.streams.errors.StreamsException:…
Nik
9
votes
1 answer

Kafka streams.allMetadata() method returns empty list

I am trying to get interactive queries working with Kafka Streams. I have ZooKeeper and Kafka running locally (on Windows), using C:\temp as the storage folder for both ZooKeeper and Kafka. I have set up the topic like…
9
votes
6 answers

UnsatisfiedLinkError on RocksDB lib DLL when developing with Kafka Streams

I'm writing a Kafka Streams application on my development Windows machine. If I try to use the leftJoin and branch features of Kafka Streams I get the error below when executing the jar application: Exception in thread "StreamThread-1"…
gvdm
9
votes
1 answer

External system queries during Kafka Stream processing

I'm trying to design a streaming architecture for streaming analytics. Requirements: RT and NRT streaming data input; stream processors implementing some financial analysis; RT and NRT analysis output stream; reference data requests during stream…
A. Mariani
9
votes
7 answers

Test Kafka Streams topology

I'm looking for a way to test a Kafka Streams application, so that I can define the input events and the test suite shows me the output. Is this possible without a real Kafka setup?
imehl
9
votes
2 answers

How to manage Kafka KStream to KStream windowed join?

According to the Apache Kafka docs, KStream-to-KStream joins are always windowed joins. My question is: how can I control the size of the window? Is it the same as the size used for keeping the data on the topic? Or, for example, can we keep data for 1 month but join the…
Am1rr3zA
9
votes
4 answers

KStreams + Spark Streaming + Machine Learning

I'm doing a POC for running a machine learning algorithm on a stream of data. My initial idea was to take data, use Spark Streaming --> Aggregate Data from several tables --> run MLlib on Stream of Data --> Produce Output. But I came across KStreams.…
9
votes
1 answer

How to filter keys and values with a Processor using Kafka Streams DSL

I have a Processor that interacts with a StateStore to filter and do complex logic on the messages. In the process(key,value) method I use context.forward(key,value) to send the keys and values that I need. For debugging purposes I also print…
Tony
9
votes
2 answers

Stop a Kafka Streams app

Is it possible to have a Kafka Streams app that runs through all the data in a topic and then exits? Example: I'm producing data into topics based on date. The consumer gets kicked off by cron, runs through all the available data and then .. does…
ethrbunny
9
votes
2 answers

Delete unused kafka consumer group

I'm using Apache Kafka 0.10 with a compacted topic as a distributed cache sync mechanism. When the application starts up it generates an instance-specific consumer group id. As instances are added and removed for horizontal scalability, obviously…
George Smith
8
votes
0 answers

How to suppress window using wall clock time instead of event time in Kafka streams?

The requirement is to send alerts if the expected 'final' event (identified from the evenType field in the event payload) is not received for a key within a time window of, say, 2 minutes on the input topic. I tried using suppress() as follows: events …
AK4647