Questions tagged [flink-streaming]

Apache Flink is an open-source platform for scalable batch and stream data processing. Flink supports batch and streaming analytics in one system. Analytical programs can be written with concise and elegant APIs in Java and Scala.

Flink's streaming API provides rich semantics, including processing-time and event-time windows, as well as stateful UDFs. Flink streaming uses a lightweight fault-tolerance mechanism with exactly-once processing guarantees.

Learn more about Apache Flink at the project website: https://flink.apache.org/

3185 questions
10
votes
5 answers

How to stop a Flink streaming job from within a program

I am trying to create a JUnit test for a Flink streaming job which writes data to a Kafka topic and reads data from the same Kafka topic using FlinkKafkaProducer09 and FlinkKafkaConsumer09 respectively. I am passing test data in the…
Mike
10
votes
1 answer

Flink streaming event time window ordering

I'm having some trouble understanding the semantics around event time windowing. The following program generates some tuples with timestamps that are used as event time and does a simple window aggregation. I would expect the output to be in…
bandrews
9
votes
5 answers

Confused about Flink task slots

I know a task manager can have several task slots. But what is a task slot? A JVM process, an object in memory, or a thread?
Brutal_JL
8
votes
11 answers

No ExecutorFactory found to execute the application in Flink 1.11.1

First of all, I have read this post about the same issue and tried to follow the solution that worked for its author (create a new quickstart with mvn and migrate the code there), but it is not working either when run out of the box from IntelliJ. Here is my…
Alter
8
votes
1 answer

Illegal reflective access by org.apache.flink.api.java.ClosureCleaner

When I run the SocketWindowWordCount program in Apache Flink, it shows a WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective…
8
votes
1 answer

Flink job suddenly crashed with error: Encountered error while consuming partitions

I have a streaming job that failed after running for 1 day and 10 hours. One of the subtasks suddenly failed and crashed the whole job. Since I set up a restart_strategy, the job automatically restarted but crashed again with the same error. I found the…
Yue Liu
8
votes
1 answer

Accessing Flink Classloader before Stream Start

In my project I would like to access the Flink User Classloader before the stream is executed. I have been instantiating my own Classloader to deserialize classes (doing my best to avoid issues related to multiple classloaders) prior to stream…
Luka Jurukovski
8
votes
1 answer

Apache Flink: How often is state de/serialized?

How frequently does Flink de/serialise operator state? Per get/update or based on checkpoints? Does the state backend make a difference? I suspect that in the case of a keyed-stream with a diverse key (millions) and thousands of events per second…
Reza Same'ei
8
votes
2 answers

Apache Flink: What's the difference between side outputs and split() in the DataStream API?

Apache Flink has a split API that lets you branch data streams: val splited = datastream.split { i => i match { case i if ... => Seq("red", "blue") case _ => Seq("green") }} splited.select("green").flatMap { .... } It also provides another…
Reza Same'ei
8
votes
1 answer

Apache Flink: guidelines for setting parallelism?

I'm trying to get some simple rules or guidelines for what values to set for operator or job parallelism. It would seem to me that it should be a number <= the number of available task slots? For example, suppose I have 2 task manager machines,…
ChrisRTech
8
votes
1 answer

Apache Flink: NullPointerException caused by TupleSerializer

When I execute my Flink application it gives me this NullPointerException: 2017-08-08 13:21:57,690 INFO com.datastax.driver.core.Cluster - New Cassandra host /127.0.0.1:9042 added 2017-08-08 13:22:02,427 INFO …
Zizou
8
votes
2 answers

Difference between shuffle() and rebalance() in Apache Flink

I am working on my bachelor's final project, which compares Apache Spark Streaming and Apache Flink (streaming only), and I have just arrived at "Physical partitioning" in Flink's documentation. The matter is that in this…
froblesmartin
8
votes
1 answer

Apache Flink - custom Java options are not recognized inside job

I've added the following line to flink-conf.yaml: env.java.opts: "-Ddy.props.path=/PATH/TO/PROPS/FILE" When starting the JobManager (jobmanager.sh start cluster), I see in the logs that the JVM option is indeed recognized 2017-02-20 12:19:23,536 INFO …
OmriManor
8
votes
1 answer

What is the difference between periodic and punctuated watermarks in Apache Flink?

It would be helpful if someone could give a use-case example that explains the difference between the watermark APIs in Apache Flink listed below: Periodic watermarks - AssignerWithPeriodicWatermarks[T] Punctuated watermarks -…
Somasundaram Sekar
7
votes
1 answer

Flink Window Boundaries, Watermark, Event Timestamp & Processing Time

Problem Definition & Establishing Concepts: Let's say we have a TumblingEventTimeWindow with a size of 5 minutes, and we have events containing two basic pieces of information: a number and an event timestamp. In this example, we kick off our Flink topology at…