Questions tagged [flink-batch]
158 questions
0
votes
1 answer
Flink batch job wait for Kafka offset commit interval
I have a Flink batch job which reads from Kafka and writes to S3. The current strategy of this job is to read
From: the committed offset in Kafka (if there is no committed offset, then read from the earliest offset)
To: the latest offset at the start…

Vinod Mohanan
0
votes
0 answers
For bounded data, how do I get Flink to "trigger" once flatmap has finished outputting all its data
I've explicitly set "batch mode" in Flink's StreamExecutionEnvironment settings, as I'm working with bounded data.
The bounded data passes through a flatmap; and the flatmap is windowed using GlobalWindows. Since the data is bounded, there is a…

Jonathan Sylvester
0
votes
1 answer
Is there a method to set up Flink memory dynamically?
When using Apache Flink, we can configure values in flink-conf.yaml. But using CLI commands, we can also assign some values dynamically when starting or submitting a job or a task in Flink.
e.g. bin/taskmanager.sh start-foreground…

Shan
0
votes
1 answer
Flink Table API filesystem connector is not available
I'm trying to write to my local filesystem with the Flink Table API (version 1.15.1). I'm using TableDescriptor.forConnector('filesystem') but I get the exception:
Could not find any factory for identifier 'filesystem' that implements…

JoeHills
0
votes
0 answers
Building StreamExecutionEnvironment from a JSON file
Is there a module in Flink similar to Storm's TopologyBuilder for building a StreamExecutionEnvironment?
Basically, I want to configure my sources and sinks in a JSON file and I would like to build the StreamExecutionEnvironment seamlessly from the JSON…

Vicky
0
votes
1 answer
java.lang.NoClassDefFoundError: org/apache/flink/util/function/SerializableFunction
Flink version: 1.14.2
Hive version: 2.1.1
I use Flink to connect to Hive and execute a batch process, and an error happened.
I use Flink SQL to execute the query:
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError:
java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError:…
0
votes
1 answer
Flink Hive API query exception: Distinct without an aggregation
I get the following exception when querying Hive with Flink's Hive Table API connector: Distinct without an aggregation.
However, the SQL query is executed correctly when using the Hue interface to query Hive.
I'm wondering if this problem is due to a bad…
0
votes
2 answers
Flink tumbling window is not triggered (no watermark strategy)
Problem statement: stream events from a Kafka source. These event payloads are in string format. Parse them into Documents and batch-insert them into the DB every 5 seconds based on event time.
map() functions are getting executed. But program control is…

AnV
0
votes
1 answer
Flink REST API: /jars/upload returning 404
The following is the code snippet I use to upload a JAR to Flink. I am getting a 404 response for this POST request. The following is the output for the request. I also tried updating the URL to /v1/jars/upload, but got the same response. All the API related to jars is…

priyadhingra19
0
votes
2 answers
Add Flink job JAR in Docker setup and run job via Flink REST API
We're running Flink in session cluster mode and automatically add JARs in the Dockerfile:
ADD pipeline-fat.jar /opt/flink/usrlib/pipeline-fat.jar
This way we can run this JAR via the Flink REST API without needing to upload it in advance:
POST…

hb0
0
votes
1 answer
How can I sort data by fields in a record in batch execution mode using the Flink DataStream API?
I need to write a batch Flink job and I prefer to use the DataStream API.
I want the output of the job to be sorted within each output file (HDFS file), so that downstream consumers can rely on this to speed up indexing.
Is there any replacement for…

Shenjiaqi
0
votes
1 answer
Is it possible to run batch processing on a dynamic table in Flink?
Currently I run multiple variant structured ETL jobs on the same table with the following steps:
Sync data from the RDBMS to the data warehouse continuously.
Run multiple ETL jobs at different times (on the data in the data warehouse at the corresponding point in time).
If it's…

BAKE ZQ
0
votes
1 answer
How to set up two different process functions for a single Flink job config?
I have a single Flink job with 3 different inputs (optional), and the same output will be emitted for each type of input.
input1 uses KeyedProcessFunction()
input2 uses ProcessWindowFunction()
Basically, the job input is a union of three inputs and…
0
votes
2 answers
Flink 1.12.x DataSet --> Flink 1.14.x DataStream
I am trying to migrate from the Flink 1.12.x DataSet API to the Flink 1.14.x DataStream API. mapPartition is not available in the Flink DataStream API.
Our code using the Flink 1.12.x DataSet API:
dataset
.
.mapPartition(new SomeMapParitionFn())
…

Saravanan
0
votes
1 answer
Flink Sorting A Global Window On A Bounded Stream
I've built a Flink application to consume data directly from Kafka, but in the event of a system failure or a need to re-process this data, I need to instead consume the data from a series of files in S3. The order in which messages are processed is…

john