Questions tagged [flink-batch]

158 questions
0
votes
1 answer

Flink batch job wait for kafka offset commit interval

I have a Flink batch job that reads from Kafka and writes to S3. The job's current strategy is to read from the committed offset in Kafka (or from the earliest offset if there is no committed offset) to the latest offset at the start…
Vinod Mohanan
0
votes
0 answers

For bounded data, how do I get Flink to "trigger" once flatmap has finished outputting all its data

I've explicitly set "batch mode" in Flink's StreamExecutionEnvironment settings, as I'm working with bounded data. The bounded data passes through a flatmap, and the flatmap is windowed using GlobalWindows. Since the data is bounded, there is a…
Jonathan Sylvester
0
votes
1 answer

Is there a method to setup flink memory dynamically?

When using Apache Flink, we can configure values in flink-conf.yaml. With CLI commands, we can also assign some values dynamically when starting or submitting a job or a task in Flink, e.g. bin/taskmanager.sh start-foreground…
Shan
0
votes
1 answer

Flink table api filesystem connector is not available

I'm trying to write to my local filesystem with the Flink Table API (version 1.15.1). I'm using TableDescriptor.forConnector('filesystem') but I get the exception: Could not find any factory for identifier 'filesystem' that implements…
0
votes
0 answers

Building StreamExecutionEnvironment from a json

Does Flink have a module similar to Storm's TopologyBuilder for building a StreamExecutionEnvironment? Basically, I want to configure my sources and sinks in a JSON file, and I would like to build the StreamExecutionEnvironment seamlessly from the JSON…
Vicky
0
votes
1 answer

java.lang.NoClassDefFoundError: org/apache/flink/util/function/SerializableFunction

Flink version: 1.14.2; Hive version: 2.1.1. An error occurred while running a batch job with Flink connected to Hive. Executing a query with Flink SQL fails with: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError:…
0
votes
1 answer

Flink hive api query exception: Distinct without an aggregation

I get the following exception when querying Hive with Flink's Hive Table API connector: Distinct without an aggregation. However, the SQL query executes correctly when I query Hive through the Hue interface. I'm wondering if this problem is due to a bad…
0
votes
2 answers

flink tumbling window is not triggered (no watermark strategy)

Problem statement: stream events from a Kafka source. The event payloads are strings. Parse them into Documents and batch-insert them into the DB every 5 seconds based on event time. The map() functions are being executed, but program control is…
AnV
0
votes
1 answer

Flink Rest API : /jars/upload returning 404

The following is the code snippet I use to upload a JAR to Flink. I am getting a 404 response for this POST request. The following is the output of the request. I also tried changing the URL to /v1/jars/upload but got the same response. All the API related to jars is…
priyadhingra19
0
votes
2 answers

Add Flink Job Jar in Docker Setup and run Job via Flink Rest API

We're running Flink in Cluster Session mode and automatically add JARs in the Dockerfile: ADD pipeline-fat.jar /opt/flink/usrlib/pipeline-fat.jar, so that we can run this JAR via the Flink REST API without needing to upload it in advance: POST…
hb0
0
votes
1 answer

How can I sort data by fields in record in Batch-Execution mode using Flink DataStream api?

I need to write a batch Flink job and I prefer to use the DataStream API. I want the job's output to be sorted within each output file (an HDFS file), so that downstream consumers can rely on this ordering to speed up indexing. Is there any replacement for…
Shenjiaqi
0
votes
1 answer

if it's possible to run batch processing on dynamic table in flink

Currently I run multiple differently structured ETL jobs on the same table using the following steps: sync data from the RDBMS to the data warehouse continuously; run multiple ETL jobs at different times (against the data in the warehouse at the corresponding point in time). If it's…
0
votes
1 answer

How to setup two different process functions for single flink job config?

I have a single Flink job with three different (optional) inputs, and the same output is emitted for each type of input. input1 uses KeyedProcessFunction(); input2 uses ProcessWindowFunction(). Basically, the job input is a union of three inputs and…
0
votes
2 answers

Flink 1.12.x DataSet --> Flink 1.14.x DataStream

I am trying to migrate from the Flink 1.12.x DataSet API to the Flink 1.14.x DataStream API. mapPartition is not available in the Flink DataStream API. Our code using the Flink 1.12.x DataSet API: dataset . .mapPartition(new SomeMapParitionFn()) …
Saravanan
0
votes
1 answer

Flink Sorting A Global Window On A Bounded Stream

I've built a Flink application that consumes data directly from Kafka, but in the event of a system failure or a need to reprocess this data, I need to consume the data from a series of files in S3 instead. The order in which messages are processed is…
john