Questions tagged [apache-beam-kafkaio]
61 questions
0 votes · 0 answers
How to create a KafkaRecord in Apache Beam Manually for Unit Tests
I'm doing an Apache Beam-based implementation, and data is taken from a Kafka stream into the pipeline through KafkaIO. After reading the data, I have a few PTransforms to process the input data, and I need to unit test the first PTransform that…

Prasad · 83 · 1 · 8
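For the unit-test question above, one approach (a sketch, not taken from the thread) is to construct KafkaRecord values by hand and feed them to the PTransform under test with Create.of(...). The constructor signature below is from Beam's Java SDK 2.x and may differ between versions; topic, partition, offset, and key/value strings are placeholders:

```java
import org.apache.beam.sdk.io.kafka.KafkaRecord;
import org.apache.beam.sdk.io.kafka.KafkaTimestampType;
import org.apache.beam.sdk.values.KV;
import org.apache.kafka.common.header.internals.RecordHeaders;

// KafkaRecord has a public constructor, so a test fixture can build one directly.
KafkaRecord<String, String> record =
    new KafkaRecord<>(
        "test-topic",                   // topic (placeholder)
        0,                              // partition
        0L,                             // offset
        0L,                             // record timestamp (epoch millis)
        KafkaTimestampType.CREATE_TIME, // how the timestamp was assigned
        new RecordHeaders(),            // empty headers
        KV.of("test-key", "test-value"));
```

The record can then be wrapped in a PCollection on a TestPipeline, e.g. via Create.of(record) with a KafkaRecordCoder, and the PTransform's output asserted with PAssert.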
0 votes · 1 answer
Disable Direct Runner Logs in Apache Beam for Kafka Consumer
I've seen a similar question asked, but about Dataflow logging rather than Direct Runner logging.
Basically, I want to turn off the wave of KafkaIO read (consumer) logs. I have tried setting the logging levels in the SDK harness as follows.
var kafkasLogs =
…

Lemon · 43 · 6
0 votes · 1 answer
Apache Beam KafkaIO Reader & Writer - Error handling and Retry mechanism
I'm working on an Apache Beam Pipeline-based implementation and I consume data from a Kafka stream. After doing some processing I need to publish the processed data into three different Kafka topics. As the runner, I use Apache Flink.
My question…

Prasad · 83 · 1 · 8
0 votes · 3 answers
Apache Beam KafkaIO - Write to Multiple Topics
Currently, I'm working on an Apache Beam pipeline implementation which consumes data from three different Kafka topics, and after some processing I create three types of objects from the data taken from the above-mentioned Kafka topics. Finally,…

Prasad · 83 · 1 · 8
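For the multi-topic question above, one commonly used pattern (a sketch assuming Beam's Java SDK; the broker address and the chooseTopic helper are hypothetical) is KafkaIO.writeRecords(), which consumes a PCollection of ProducerRecord so each element can carry its own destination topic:

```java
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// Upstream, a ParDo picks the destination per element, e.g.:
//   new ProducerRecord<>(chooseTopic(event), key, value)  // chooseTopic is hypothetical
// A single sink then fans out to all three topics.
records.apply(
    KafkaIO.<String, String>writeRecords()
        .withBootstrapServers("broker-1:9092")  // placeholder broker list
        .withKeySerializer(StringSerializer.class)
        .withValueSerializer(StringSerializer.class));
```

Here `records` is assumed to be a `PCollection<ProducerRecord<String, String>>`; the topic set on each ProducerRecord decides where that element is published.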
0 votes · 0 answers
Avro schema parser ignoring logical type for byte type
I am trying to parse an Avro schema string into a Schema object using the Avro library. When parsing, the parser seems to ignore the logical type provided in the Avro schema, which causes the deserialization of the JSON data to not work properly.
Sample Avro schema (json…

vkt · 1,401 · 2 · 20 · 46
0 votes · 1 answer
Send BigQuery table rows to Kafka as Avro messages using Apache Beam
I need to publish the BigQuery table rows to Kafka in Avro format.
PCollection rows =
pipeline
.apply(
"Read from BigQuery query",
…

vkt · 1,401 · 2 · 20 · 46
0 votes · 0 answers
Google Dataflow with "Workflow failed"
I'm working on a simple Beam Dataflow job in Java on Google Cloud Platform. I've tested it locally and the pipeline runs well.
When I deploy it on Dataflow, I get this looping error:
{
insertId: "10mkpdlb7i"
labels: {4}
logName:…

shinjie · 1 · 1
0 votes · 1 answer
Apache Beam Pipeline KafkaIO - Commit offset manually
I have a Beam pipeline to consume streaming events with multiple stages (PTransforms) to process them. See the following code,
pipeline.apply("Read Data from Stream", StreamReader.read())
.apply("Decode event and extract relevant…

Prasad · 83 · 1 · 8
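On the manual-commit question above: KafkaIO does not expose a per-element commit, but commitOffsetsInFinalize() commits offsets back to Kafka when the runner finalizes a checkpoint, which is usually the closest equivalent. A sketch assuming Beam's Java SDK; the broker, topic, and group id are placeholders:

```java
import java.util.Map;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;

KafkaIO.<String, String>read()
    .withBootstrapServers("broker-1:9092")   // placeholder
    .withTopic("events")                     // placeholder
    .withKeyDeserializer(StringDeserializer.class)
    .withValueDeserializer(StringDeserializer.class)
    // a consumer group.id is required for offset commits
    .withConsumerConfigUpdates(
        Map.of(ConsumerConfig.GROUP_ID_CONFIG, "my-consumer-group"))
    // commit offsets when the runner finalizes a checkpoint,
    // instead of relying on enable.auto.commit
    .commitOffsetsInFinalize();
```

Note the commit happens at checkpoint finalization, so after a crash some records between the last finalized checkpoint and the failure point may be redelivered.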
0 votes · 1 answer
Failing Apache Beam Pipeline when consuming events through KafkaIO on Flink runner
I have a Beam pipeline with several stages that consumes data through KafkaIO, and the code looks like the following:
pipeline.apply("Read Data from Stream", StreamReader.read())
.apply("Decode event and extract relevant fields", ParDo.of(new…

Prasad · 83 · 1 · 8
0 votes · 2 answers
How to consume Avro Serialized messages from AWS MSK via Apache Beam
PCollection> kafkaRecordPCollection =
pipeline.apply(
KafkaIO.read()
.withBootstrapServers("bootstrap-server")
.withTopic("topic")
…
0 votes · 0 answers
Apache Beam KafkaIO - Set truststore file(jks) location in Kafka consumer properties
I am running an Apache Beam Java app in Spark client mode using YARN.
On spark-submit, the jks file gets copied to the working directory of the Spark executors.
But the reference to this path in the Apache Beam KafkaIO config parameter is not…

Kartik · 39 · 4
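For the truststore question above, the usual route (a sketch; all property values are placeholders) is to pass the SSL settings through withConsumerConfigUpdates. When the jks is shipped via spark-submit --files it lands in each executor's working directory, so a relative path can resolve there:

```java
import java.util.Map;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.kafka.common.serialization.StringDeserializer;

KafkaIO.<String, String>read()
    .withBootstrapServers("broker-1:9093")   // placeholder SSL listener
    .withTopic("events")                     // placeholder
    .withKeyDeserializer(StringDeserializer.class)
    .withValueDeserializer(StringDeserializer.class)
    .withConsumerConfigUpdates(Map.of(
        "security.protocol", "SSL",
        // relative path: resolved against the executor's working directory,
        // which is where spark-submit --files places the jks
        "ssl.truststore.location", "client.truststore.jks",
        "ssl.truststore.password", "changeit"));  // placeholder
```

An absolute path only works if the file exists at that same path on every executor node, which is why the relative-path-plus---files combination is the common workaround on YARN.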
0 votes · 1 answer
Apache Beam WriteToKafka (Python SDK) doesn't write to topic (no sign of error)
I am trying to write a stream to a Kafka topic using the WriteToKafka class of Apache Beam (Python SDK). However, it runs the script endlessly (without error) and doesn't write the stream to the topic. I have to cancel the run; it doesn't stop, it doesn't give…

akurmustafa · 122 · 10
0 votes · 1 answer
How to expose Kafka metrics using KafkaIO in a Python Beam pipeline?
I want to be able to expose my consumer and producer metrics in a Beam pipeline written in Python that uses the KafkaIO library. Examples of the metrics I mean are the ones that you get from the Python confluent-kafka library…

RMCP · 11 · 2
0 votes · 1 answer
Apache Beam Issue with Spark Runner while using Kafka IO
I am trying to test KafkaIO in my Apache Beam code with the Spark runner.
The code works fine with the Direct Runner.
However, if I add the code line below, it throws an error:
options.setRunner(SparkRunner.class);
Error:
ERROR…

Rahul Dawn · 19 · 5
0 votes · 1 answer
How can I simulate event lateness in Apache Beam reading from a Kafka Source
I am trying to tweak the windowing parameters in my streaming Beam pipeline. The parameters I am modifying are withAllowedLateness, triggers, intervals, pane firing, etc.
However, I don't know how to trigger lateness in my Kafka-consuming pipeline…

Fabio · 555 · 3 · 9 · 24
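On simulating lateness: rather than forcing late data through a real Kafka topic, one option in tests (a sketch assuming Beam's Java SDK; TestStream runs on the DirectRunner) is to substitute the KafkaIO source with a TestStream, advance the watermark past the window end, and then emit an element with an older timestamp:

```java
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.testing.TestStream;
import org.apache.beam.sdk.values.TimestampedValue;
import org.joda.time.Instant;

TestStream<String> events =
    TestStream.create(StringUtf8Coder.of())
        .addElements(TimestampedValue.of("on-time", new Instant(0L)))
        // push the watermark past the end of a (hypothetical) one-minute window ...
        .advanceWatermarkTo(new Instant(61_000L))
        // ... then emit an element whose timestamp is behind the watermark,
        // so it arrives late with respect to the allowed-lateness setting
        .addElements(TimestampedValue.of("late", new Instant(10_000L)))
        .advanceWatermarkToInfinity();
```

Applying `pipeline.apply(events)` in place of the KafkaIO read lets the withAllowedLateness, trigger, and pane-firing settings be exercised deterministically.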