Questions tagged [flink-batch]
158 questions
0
votes
1 answer
Flink forward files from List filePaths
We have a list of file paths from a DB table, each with a timestamp of when it was created. We are trying to figure out how to use the file-path list from the DB to forward only those files from NFS to a Kafka sink.
Right now I am using a customized version of…

VSK
- 359
- 2
- 5
- 20
0
votes
1 answer
FLINK - how to process logic on SQL query result
My requirement is to process, or build some logic around, the result of a SQL query in Flink. For simplicity, let's say I have two SQL queries running over different window sizes on one event stream. My question is
a) how will I know for which query…

Ashutosh
- 33
- 8
0
votes
1 answer
How does TM recovery handle past broadcast data
In the context of high availability (HA) of TaskManagers (TMs), when a TM goes down, the JobManager (JM) restores a new one from the latest checkpoint of the failed TM.
Say we have 3 TMs (tm1, tm2, & tm3)
at a given time t, where every TM's checkpoint (cp) is at cp1. All TMs…

ardhani
- 303
- 1
- 11
0
votes
1 answer
FLINK - load historical data and maintain a window of 30 days
My requirement is to hold 30 days of data in the stream, available on any given day for processing. So on the first day, when the Flink application starts, it will fetch 30 days of data from the database and merge it with the current stream data.
My challenge is managing 30 days of data…

Ashutosh
- 33
- 8
0
votes
1 answer
Not able to sleep in a custom source function in Apache Flink which is unioned with another source
I have two sources, one a Kafka source and one a custom source. I need to make the custom source sleep for one hour, but I am getting the interruption below.
java.lang.InterruptedException: sleep interrupted
at java.lang.Thread.sleep(Native…

patel akash
- 75
- 8
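The InterruptedException above is typically Flink itself interrupting the source thread when the job is cancelled or shut down, so the sleep has to treat the interrupt as a stop signal rather than an error. A minimal sketch of that pattern in plain Java, without any Flink dependency (the class and method names here are hypothetical, not Flink API):

```java
// Cancellable pause, as one would use inside a SourceFunction.run() loop:
// sleep in small slices so cancellation is honored quickly, and swallow
// the interrupt as a stop signal instead of letting it escape.
public class PausableSource {
    private volatile boolean running = true;

    public void cancel() { running = false; }

    // Returns true if the full pause completed, false if cancelled/interrupted.
    public boolean pause(long totalMillis, long sliceMillis) {
        long remaining = totalMillis;
        while (running && remaining > 0) {
            long chunk = Math.min(sliceMillis, remaining);
            try {
                Thread.sleep(chunk);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                return false;                       // treat as cancellation
            }
            remaining -= chunk;
        }
        return running;
    }

    public static void main(String[] args) {
        PausableSource src = new PausableSource();
        System.out.println(src.pause(50, 10)); // full pause completes: true
        src.cancel();
        System.out.println(src.pause(50, 10)); // already cancelled: false
    }
}
```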
0
votes
1 answer
Flink combination of windowByTime and triggerByCount
source.keyBy(0)
.window(TumblingEventTimeWindows.of(Time.seconds(5)))
.trigger(PurgingTrigger.of(CountTrigger.of[TimeWindow](2)))
.process(new TestFun())
Explanation:
Let's say I have 3 events [E1, E2, E3], which should be triggered by…

Tulasi
- 79
- 1
- 9
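Independent of Flink, the firing semantics of the PurgingTrigger/CountTrigger combination in the snippet above can be sketched in plain Java (class and method names are hypothetical; a real Flink trigger also reacts to the window's event-time cleanup rather than an explicit end call):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of CountTrigger(2) wrapped in PurgingTrigger: the window pane
// fires every 2 elements and is purged, so a lone third element waits
// until the window's time boundary.
public class CountTriggerSketch {
    private final int maxCount;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> firedPanes = new ArrayList<>();

    public CountTriggerSketch(int maxCount) { this.maxCount = maxCount; }

    // Called for each element assigned to the window.
    public void onElement(String event) {
        buffer.add(event);
        if (buffer.size() == maxCount) {           // CountTrigger fires
            firedPanes.add(new ArrayList<>(buffer));
            buffer.clear();                        // PurgingTrigger purges
        }
    }

    // Called when the event-time window closes: flush what remains.
    public List<List<String>> onWindowEnd() {
        if (!buffer.isEmpty()) {
            firedPanes.add(new ArrayList<>(buffer));
            buffer.clear();
        }
        return firedPanes;
    }

    public static void main(String[] args) {
        CountTriggerSketch w = new CountTriggerSketch(2);
        w.onElement("E1");
        w.onElement("E2"); // fires pane [E1, E2]
        w.onElement("E3"); // waits in the window
        System.out.println(w.onWindowEnd()); // [[E1, E2], [E3]]
    }
}
```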
0
votes
1 answer
Flink requires a local path for the Hive conf directory, but how to give that path if we are submitting the Flink job on YARN?
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/hive/#connecting-to-hive
According to this link, Flink requires a local Hive conf folder path, but I need to submit the Flink job on YARN, so Flink tries to find the path in the YARN container, e.g.…

patel akash
- 75
- 8
0
votes
1 answer
How to read an Excel file in Apache Flink?
Could someone explain how to load Excel data into Apache Flink? I have seen other kinds of formats in the API docs, such as txt and csv, but not Excel.
Thanks in advance.

toni_92
- 1
- 2
0
votes
1 answer
Flink Hadoop integration problem - Could not find a file system implementation for scheme 'hdfs'
I'm struggling with integrating HDFS into Flink.
Scala binary version: 2.12,
Flink (cluster) version: 1.10.1
My HADOOP_CONF_DIR and HDFS configuration are shown here; this configuration and HADOOP_CONF_DIR are also the same in the taskmanager as…

Aykut Demirel
- 1
- 1
- 1
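For the "scheme 'hdfs'" error above, the usual cause is that the default Flink download ships without Hadoop jars, so the HDFS filesystem cannot be loaded unless Hadoop is on Flink's classpath. A sketch of the typical environment setup (the paths are assumptions; adjust to your cluster):

```shell
# Point Flink at the Hadoop configuration (assumed location):
export HADOOP_CONF_DIR=/etc/hadoop/conf
# Put the Hadoop jars on Flink's classpath so the 'hdfs' scheme resolves:
export HADOOP_CLASSPATH=$(hadoop classpath)
# Then start the cluster / submit the job from the same shell:
./bin/start-cluster.sh
```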
0
votes
1 answer
filter by max in tuple field in Apache Flink
I'm using the Apache Flink Streaming API to process a data file, and I'm interested in getting only the results from the last of the windows. Is there a way to do this? If it is not possible, I thought I could filter through the maximum of…

ekth0r
- 65
- 5
0
votes
1 answer
How to add new rows to an Apache Flink Table
Is it possible to add a new record/row to a Flink table? For example, I have the following table configuration:
ExecutionEnvironment env = TableEnvironmentLoader.getExecutionEnvironment();
BatchTableEnvironment tableEnv =…

soCalJack23
- 5
- 5
0
votes
2 answers
Does the Flink 1.7.2 DataSet not support a Kafka sink?
Does the Flink 1.7.2 DataSet not support a Kafka sink?
After the batch operation I need to publish the message to Kafka; that is, my source is Postgres and my sink is Kafka.
Is it possible?

MadProgrammer
- 513
- 5
- 18
0
votes
1 answer
Flink Java API - Pojo Type to Tuple Datatype
I am creating a small utility with the Flink Java API to learn its functionality. I am trying to read a CSV file and just print it, and I have developed a POJO class for the structure of the data. When I executed the code, I don't see the right…

Karthi
- 708
- 1
- 19
- 38
0
votes
1 answer
Why is it bad to execute a Flink job with parallelism = 1?
I'm trying to understand the important factors I need to take into consideration before submitting a Flink job.
My question is: what is the degree of parallelism, is there a (physical) upper bound, and how can the parallelism impact the…

Maher Marwani
- 43
- 1
- 5
0
votes
2 answers
Performance difference when serializing a generic object type vs a statically declared type
Do we need to statically type/declare the data type of a variable when it is going to be serialized? Does it improve serialization performance?
I'm creating a Flink project for batch processing. I wrote a custom input reader, which is going to…

Keshore Durairaj
- 27
- 5