Questions tagged [flink-batch]
158 questions
0
votes
1 answer
Flink events are reaching the JobManager but not the TaskManager in a cluster
I am trying to run a Flink application on a cluster. The application deployed successfully, and I can see that the JobManager and TaskManager are running and that resource registration completed successfully.
The application needs a dummy event, and that part works fine, and the SQL query…

Ashutosh
- 33
- 8
0
votes
1 answer
How to detect that a Flink batch job has finished
Currently, I have a streaming job that fires a batch job when it receives a specific trigger.
I want to track that fired batch job and, when it finishes, insert an entry into a database such as Elasticsearch.
Any ideas how we can…

ukgaudram
- 85
- 8
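One way to get a completion hook is to register a JobListener on the environment that submits the batch job; the callback then runs in the submitting (streaming) application. Below is a minimal sketch, assuming Flink 1.10+; the class name and output path are made up for illustration, and the Elasticsearch write is only indicated by a comment.

import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.core.execution.JobListener;
import org.apache.flink.core.fs.FileSystem.WriteMode;

public class BatchJobCompletionExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // The listener runs in the client that submitted the job, i.e. in the
        // streaming application that fired the batch job.
        env.registerJobListener(new JobListener() {
            @Override
            public void onJobSubmitted(JobClient jobClient, Throwable throwable) {
                if (throwable == null) {
                    System.out.println("Batch job submitted: " + jobClient.getJobID());
                }
            }

            @Override
            public void onJobExecuted(JobExecutionResult result, Throwable throwable) {
                if (throwable == null) {
                    // Write a completion record to Elasticsearch (or any database) here.
                    System.out.println("Batch job finished in " + result.getNetRuntime() + " ms");
                }
            }
        });

        env.fromElements(1, 2, 3)
           .writeAsText("file:///tmp/flink-batch-out", WriteMode.OVERWRITE);
        env.execute("sample batch job");
    }
}

Alternatively, executeAsync() returns a JobClient whose job-result future can be used for the same purpose.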
0
votes
1 answer
Is there an equivalent of Spark's RDD.persist(..) in Flink?
Spark's RDD.persist(..) helps avoid re-evaluating an RDD.
Is there a comparable feature in Flink?
Specifically, if I write code like the following, will Flink evaluate dataStream once or twice?
val dataStream =…

Grant
- 500
- 1
- 5
- 18
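For reference, within a single Flink job a reused DataStream (or DataSet) is not re-evaluated: both consumers become part of the same dataflow graph, the source runs once, and records are forwarded to each branch, so there is no direct need for a persist()-style call. A minimal sketch (class name and sinks are illustrative):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReuseStreamExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // One source definition...
        DataStream<Long> numbers = env.fromElements(1L, 2L, 3L, 4L, 5L);

        // ...consumed by two independent branches. Both branches belong to the
        // same job graph, so the source is evaluated once and every record is
        // forwarded to both consumers; no persist()/cache() call is needed.
        numbers.filter(n -> n % 2 == 0).print("even");
        numbers.filter(n -> n % 2 == 1).print("odd");

        env.execute("stream reuse example");
    }
}

Re-evaluation only becomes a concern across separate executions (for example, two env.execute() calls), where each job recomputes its own pipeline.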
0
votes
1 answer
How does Flink's Collector.collect() handle data?
I'm trying to understand what Flink's Collector.collect() does and how it handles incoming/outgoing data.
Example taken from the Flink DataSet API:
The following code transforms a DataSet of text lines into a DataSet of words:
DataSet output =…

tooobsias
- 39
- 3
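For context, Collector.collect(record) simply hands one record to the runtime, which forwards it to the next operator (directly when operators are chained, otherwise via serialized network buffers); the user function can emit zero, one, or many records per input. A minimal sketch in the spirit of the word-splitting example from the DataSet docs:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.util.Collector;

public class CollectorExample {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> lines = env.fromElements("to be or not to be");

        DataSet<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public void flatMap(String line, Collector<String> out) {
                for (String word : line.split(" ")) {
                    // Each call hands one record to the runtime, which passes it on
                    // to the next operator (directly if chained, otherwise through
                    // serialized buffers); the function itself stores nothing.
                    out.collect(word);
                }
            }
        });

        words.print();
    }
}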
0
votes
1 answer
How to Implement Patterns to Match Brute-Force Login and Port-Scanning Attacks Using Flink CEP
I have a use case where a large number of logs will be consumed by Apache Flink CEP. My goal is to detect brute-force and port-scanning attacks. The challenge here is that, while in ordinary CEP we compare a value against a constant…

JDForLife
- 91
- 2
- 10
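A minimal sketch of the basic shape of such a pattern, assuming the flink-cep dependency and a made-up LoginEvent type: five consecutive failed logins from the same key within 30 seconds. Dynamic thresholds (rather than the constants used here) would typically need an IterativeCondition or broadcast state, which this sketch does not cover.

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

import java.util.List;
import java.util.Map;

public class BruteForcePatternSketch {

    // Hypothetical log record; the real schema will differ.
    public static class LoginEvent {
        public String sourceIp;
        public boolean success;
        public LoginEvent() {}
        public LoginEvent(String sourceIp, boolean success) {
            this.sourceIp = sourceIp;
            this.success = success;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<LoginEvent> logins = env.fromElements(
                new LoginEvent("10.0.0.1", false),
                new LoginEvent("10.0.0.1", false),
                new LoginEvent("10.0.0.1", false),
                new LoginEvent("10.0.0.1", false),
                new LoginEvent("10.0.0.1", false));

        // Five consecutive failed logins within 30 seconds.
        Pattern<LoginEvent, ?> bruteForce = Pattern.<LoginEvent>begin("fails")
                .where(new SimpleCondition<LoginEvent>() {
                    @Override
                    public boolean filter(LoginEvent e) {
                        return !e.success;
                    }
                })
                .times(5).consecutive()
                .within(Time.seconds(30));

        // Keying by source IP makes the count per attacker rather than global.
        PatternStream<LoginEvent> matches =
                CEP.pattern(logins.keyBy(e -> e.sourceIp), bruteForce);

        matches.select(new PatternSelectFunction<LoginEvent, String>() {
            @Override
            public String select(Map<String, List<LoginEvent>> match) {
                return "possible brute force from " + match.get("fails").get(0).sourceIp;
            }
        }).print();

        env.execute("brute force detection sketch");
    }
}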
0
votes
1 answer
Unbounded collection-based stream in Flink
Is it possible to create an unbounded collection-based stream in Flink? For example, if we add an element to a map, Flink should process it the way it does with a socket stream; it should not exit once the initial elements have been read.

JDForLife
- 91
- 2
- 10
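env.fromCollection(...) produces a bounded stream that ends when the collection is exhausted. One option for keeping the stream alive is a custom SourceFunction that polls a queue until cancelled, similar to how the socket source behaves. A self-contained sketch (the static queue is only for illustration and would not be shared across a real cluster):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class UnboundedQueueSource implements SourceFunction<String> {

    // Fed by some external producer in a real job; static only to keep the
    // sketch self-contained (it would not be shared across TaskManagers).
    public static final BlockingQueue<String> QUEUE = new LinkedBlockingQueue<>();

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        // Unlike env.fromCollection(...), this source never finishes on its own:
        // it keeps polling for new elements until cancel() is called.
        while (running) {
            String next = QUEUE.poll(100, TimeUnit.MILLISECONDS);
            if (next != null) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(next);
                }
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.addSource(new UnboundedQueueSource()).print();
        QUEUE.add("hello");   // elements added later are still picked up
        env.execute("unbounded collection sketch");
    }
}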
0
votes
1 answer
Set Flink Detached Mode using Java
Flink cluster details:
Number of nodes: 4
Flink version: 1.11
Flink client: RestClusterClient
We are submitting a Flink batch job from a streaming job using PackagedProgram, but our requirement is to execute only one job at a time; let's say we got…

Murtaza Zaveri
- 49
- 7
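A sketch of one way to submit in detached fashion with the classes the question mentions, assuming Flink 1.11-style APIs; the host, jar path, and entry-point class are placeholders. RestClusterClient.submitJob(...) is asynchronous, and DeploymentOptions.ATTACHED can additionally be set to false when the configuration is handed to an executor.

import org.apache.flink.api.common.JobID;
import org.apache.flink.client.deployment.StandaloneClusterId;
import org.apache.flink.client.program.PackagedProgram;
import org.apache.flink.client.program.PackagedProgramUtils;
import org.apache.flink.client.program.rest.RestClusterClient;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.DeploymentOptions;
import org.apache.flink.configuration.RestOptions;
import org.apache.flink.runtime.jobgraph.JobGraph;

import java.io.File;
import java.util.concurrent.CompletableFuture;

public class DetachedSubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        config.setString(RestOptions.ADDRESS, "jobmanager-host");  // placeholder address
        config.setInteger(RestOptions.PORT, 8081);
        config.setBoolean(DeploymentOptions.ATTACHED, false);      // request detached execution

        PackagedProgram program = PackagedProgram.newBuilder()
                .setJarFile(new File("/path/to/batch-job.jar"))    // placeholder jar
                .setEntryPointClassName("com.example.BatchJob")    // placeholder class
                .build();

        JobGraph jobGraph = PackagedProgramUtils.createJobGraph(program, config, 1, false);

        try (RestClusterClient<StandaloneClusterId> client =
                     new RestClusterClient<>(config, StandaloneClusterId.getInstance())) {
            // submitJob() returns as soon as the job is accepted by the cluster,
            // so the submitting streaming job is not blocked while the batch runs.
            CompletableFuture<JobID> jobId = client.submitJob(jobGraph);
            System.out.println("Submitted job " + jobId.get());
        }
    }
}

To enforce one job at a time, the submitter could wait on the previous job's result (for example via client.requestJobResult(jobId)) before submitting the next.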
0
votes
0 answers
Apache Flink stream limit
I have quite a lot of bounded data that I need to filter after reading it from a DataSource (say, I'm unable to filter it directly in the query due to complex filtering logic), and I need to restrict the maximum amount of data at the end of the pipeline (to…

viator
- 1,413
- 3
- 14
- 25
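In the DataSet API, first(n) acts as an unordered LIMIT and can cap the pipeline output after the filtering step. A minimal sketch (the filter stands in for the real logic):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class LimitSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Integer> source = env.fromElements(5, 3, 8, 1, 9, 2, 7);

        // Apply the complex filtering first, then cap the result size.
        // first(n) returns at most n records without any ordering guarantee.
        DataSet<Integer> limited = source
                .filter(value -> value > 2)   // stand-in for the real filtering logic
                .first(3);

        limited.print();
    }
}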
0
votes
1 answer
Problem using the collect() function of DataSet in Apache Flink
I am trying to calculate the Adamic-Adar index of relations in a social-media follower graph. I set up my edges, vertices, DataSet, and Graph using the Apache Flink Gelly library. Here is my code:
import…

amin rahman
- 39
- 10
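Without the full code it is hard to say what fails, but most issues around DataSet.collect() come down to its eager semantics, sketched below: collect() builds and executes the job immediately and ships the result to the client, so it is meant for small results only.

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

import java.util.List;

public class CollectSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> data = env.fromElements("a", "b", "c");

        // collect() triggers execution of the plan right here and ships every
        // result to the client; calling env.execute() afterwards without
        // defining new sinks fails with a "no new data sinks have been
        // defined" error.
        List<String> result = data.collect();
        System.out.println(result);
    }
}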
0
votes
1 answer
Create an Elasticsearch input format using Flink's RichInputFormat
We are using Elasticsearch 6.8.4 and Flink 1.0.18.
We have an index with 1 shard and 1 replica in Elasticsearch, and I want to create a custom input format to read and write data in Elasticsearch using the Apache Flink DataSet API with more than 1…

Murtaza Zaveri
- 49
- 7
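A skeleton of a parallel RichInputFormat, assuming the DataSet API; splits are modeled with GenericInputSplit, the Elasticsearch scroll/slice calls are only indicated by comments, and dummy records keep the sketch runnable on its own. The class name and all Elasticsearch details are placeholders.

import org.apache.flink.api.common.io.DefaultInputSplitAssigner;
import org.apache.flink.api.common.io.RichInputFormat;
import org.apache.flink.api.common.io.statistics.BaseStatistics;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.io.GenericInputSplit;
import org.apache.flink.core.io.InputSplitAssigner;

import java.io.IOException;

public class EsInputFormatSketch extends RichInputFormat<String, GenericInputSplit> {

    private int recordsLeft;   // stand-in for the scroll cursor of this split

    @Override
    public void configure(Configuration parameters) {
        // read host/index/query settings here
    }

    @Override
    public BaseStatistics getStatistics(BaseStatistics cachedStatistics) {
        return cachedStatistics;   // no statistics available
    }

    @Override
    public GenericInputSplit[] createInputSplits(int minNumSplits) {
        // One split per requested parallel instance; with Elasticsearch this could
        // map to one "sliced scroll" per split so several readers can work in
        // parallel even against a single-shard index.
        GenericInputSplit[] splits = new GenericInputSplit[minNumSplits];
        for (int i = 0; i < minNumSplits; i++) {
            splits[i] = new GenericInputSplit(i, minNumSplits);
        }
        return splits;
    }

    @Override
    public InputSplitAssigner getInputSplitAssigner(GenericInputSplit[] inputSplits) {
        return new DefaultInputSplitAssigner(inputSplits);
    }

    @Override
    public void open(GenericInputSplit split) throws IOException {
        // open the Elasticsearch client and start a scroll for this split here
        this.recordsLeft = 3;   // dummy data so the sketch runs standalone
    }

    @Override
    public boolean reachedEnd() {
        return recordsLeft <= 0;
    }

    @Override
    public String nextRecord(String reuse) {
        // fetch the next hit from the current scroll page here
        return "doc-" + (recordsLeft--);
    }

    @Override
    public void close() {
        // clear the scroll and close the Elasticsearch client here
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        env.createInput(new EsInputFormatSketch()).setParallelism(2).print();
    }
}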
0
votes
1 answer
Testing Flink jobs with MiniCluster to trigger a timer using processing time
Is there any way to control processing time to trigger a timer when testing Flink jobs with MiniClusterWithClientResource?
I'm able to test both methods of the KeyedCoProcessFunction, i.e. processElement()... triggering the timer callback, i.e…

Fernando Chong
- 31
- 3
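With MiniClusterWithClientResource the processing-time clock is the wall clock, so it cannot be advanced directly; the usual workaround is an operator test harness, where setProcessingTime(...) moves the clock and fires timers deterministically. A sketch using the single-input harness from Flink's test utilities (flink-streaming-java test-jar) for brevity; a KeyedCoProcessFunction would use the analogous two-input harness.

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness;
import org.apache.flink.streaming.util.ProcessFunctionTestHarnesses;
import org.apache.flink.util.Collector;

public class TimerHarnessSketch {

    // Registers a processing-time timer 10 s after each element.
    public static class DelayedEcho extends KeyedProcessFunction<String, String, String> {
        @Override
        public void processElement(String value, Context ctx, Collector<String> out) {
            ctx.timerService().registerProcessingTimeTimer(
                    ctx.timerService().currentProcessingTime() + 10_000);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
            out.collect("timer fired for key " + ctx.getCurrentKey());
        }
    }

    public static void main(String[] args) throws Exception {
        KeyedOneInputStreamOperatorTestHarness<String, String, String> harness =
                ProcessFunctionTestHarnesses.forKeyedProcessFunction(
                        new DelayedEcho(), value -> value, Types.STRING);
        harness.open();

        harness.setProcessingTime(0);
        harness.processElement("a", 0L);

        // Advancing the harness clock past the registered timer fires the callback
        // deterministically, without waiting 10 real seconds.
        harness.setProcessingTime(10_001);

        System.out.println(harness.extractOutputValues());  // [timer fired for key a]
        harness.close();
    }
}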
0
votes
0 answers
Apache Flink: stateful reading of a file from S3
I have a Flink batch job that reads a very large Parquet file from S3 and then sinks JSON into a Kafka topic.
The problem is: how can I make the file-reading process stateful? I mean that whenever the job is interrupted or crashes, it should start from…

mstzn
- 2,881
- 3
- 25
- 37
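One option is to run the read as a streaming job with checkpointing enabled, so the file source's splits and read offsets are stored in state and a restart resumes from the last completed checkpoint rather than re-reading everything. A rough sketch using the generic readFile(...) API with a plain text format for brevity; a real job would use a Parquet input format and a Kafka sink instead, and the path is a placeholder.

import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class ResumableFileReadSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing is what makes the read "stateful": the file source stores
        // the pending splits and the current read offset in checkpoints, so a
        // restarted job continues from the last completed checkpoint instead of
        // re-reading the whole file.
        env.enableCheckpointing(60_000);

        String path = "s3://my-bucket/input/";   // placeholder bucket/path
        TextInputFormat format = new TextInputFormat(new Path(path));

        env.readFile(format, path, FileProcessingMode.PROCESS_CONTINUOUSLY, 60_000)
           // a Kafka sink would be attached here; print() keeps the sketch
           // free of Kafka connector specifics
           .print();

        env.execute("resumable file read sketch");
    }
}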
0
votes
2 answers
Flink savepoint with local execution environment (like standalone application)
How can I take a Flink savepoint with a standalone application (local execution environment or mini cluster)?
I configured the savepoint directory in the flink-conf.yaml file, but I am not sure how to take the savepoint before shutting down the application and how to…

Ashutosh
- 33
- 8
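A sketch of one approach: run the job via executeAsync() and use the returned JobClient to stop with a savepoint before shutdown. This assumes a Flink 1.10+ local environment; whether stop-with-savepoint is fully supported on an embedded mini cluster can depend on the exact Flink version, and the directory and dummy source are placeholders.

import org.apache.flink.configuration.CheckpointingOptions;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class LocalSavepointSketch {

    // Endless dummy source so the job stays running until we stop it.
    public static class TickSource implements SourceFunction<Long> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            long i = 0;
            while (running) {
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(i++);
                }
                Thread.sleep(100);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Local (in-JVM) environment with an explicit savepoint directory configured.
        Configuration config = new Configuration();
        config.setString(CheckpointingOptions.SAVEPOINT_DIRECTORY, "file:///tmp/savepoints");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1, config);
        env.enableCheckpointing(5_000);

        env.addSource(new TickSource()).print();

        // executeAsync() returns a JobClient that can stop the job with a savepoint
        // before the application shuts down.
        JobClient jobClient = env.executeAsync("local savepoint sketch");
        Thread.sleep(10_000);   // let the job run for a bit

        String savepointPath =
                jobClient.stopWithSavepoint(false, "file:///tmp/savepoints").get();
        System.out.println("Savepoint written to " + savepointPath);
    }
}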
0
votes
1 answer
Using Flink/Kubernetes to replace ETL jobs (on SSIS): one Flink cluster per job type, or create and destroy a Flink cluster per job execution?
I am trying to assess the feasibility of replacing hundreds of feed-file ETL jobs created with SSIS packages by Apache Flink jobs (with Kubernetes as the underlying infrastructure). One recommendation I saw in an article is "to use one flink cluster for one…

Vishal
- 11
- 2
0
votes
1 answer
Flink yarn-session mode becomes unstable when running ~10 batch jobs at the same time
I am trying to set up a Flink YARN session to run 100+ batch jobs. After it connects to ~40 task managers and ~10 jobs are running (each task manager with 2 slots and 1 GB of memory), the session appears to become unstable. There were enough…

joss
- 695
- 1
- 5
- 16