Questions tagged [flink-streaming]

Apache Flink is an open-source platform for scalable batch and stream data processing. Flink supports batch and streaming analytics in one system. Analytical programs can be written in concise, elegant APIs in Java and Scala.

Flink's streaming API provides rich semantics, including processing-time and event-time windows as well as stateful UDFs. Flink streaming uses a lightweight fault-tolerance mechanism with exactly-once processing guarantees.

Learn more about Apache Flink at the project website: https://flink.apache.org/
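The event-time windows and watermarks mentioned above can be illustrated, independently of Flink's actual API, with a small stdlib-Python simulation (all names here are illustrative, not Flink classes): events are assigned to tumbling windows by timestamp, the watermark trails the highest timestamp seen by a fixed bound, and a window fires only once the watermark passes its end.

```python
from collections import defaultdict

WINDOW_SIZE = 60          # tumbling windows of 60 time units
OUT_OF_ORDERNESS = 10     # watermark lags the max seen timestamp by 10

def window_start(ts):
    # Assign an event to the tumbling window containing its timestamp.
    return (ts // WINDOW_SIZE) * WINDOW_SIZE

def run(events):
    """events: iterable of (timestamp, value); returns fired (window_start, sum) pairs."""
    windows = defaultdict(int)   # open windows: start -> aggregated sum
    watermark = float("-inf")
    fired = []
    for ts, value in events:
        windows[window_start(ts)] += value
        # Bounded-out-of-orderness watermark, in the spirit of Flink's
        # forBoundedOutOfOrderness strategy.
        watermark = max(watermark, ts - OUT_OF_ORDERNESS)
        # A window fires once the watermark passes its end.
        for start in sorted(windows):
            if start + WINDOW_SIZE <= watermark:
                fired.append((start, windows.pop(start)))
    return fired

# Out-of-order input: the late (30, 1) event still lands in the [0, 60)
# window, which fires only when the watermark passes 60 (at the ts=75 event).
print(run([(5, 1), (61, 1), (30, 1), (75, 1)]))  # → [(0, 2)]
```

The late event at timestamp 30 is still counted because the watermark had not yet passed 60 when it arrived; that is the out-of-orderness tolerance the description refers to.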

3185 questions
0 votes · 0 answers

Parquet file sink to Azure Blob

I have implemented the following sink: object ParquetSink { def parquetFileSink[A <: Message: ClassTag]( assigner: A => String, config: Config )(implicit lc: LoggingConfigs): FileSink[A] = { val bucketAssigner = new…
Eli Golin
0 votes · 0 answers

How can I trigger my window with a 1-hour latency in Apache Flink?

I'm working with Apache Flink 1.16.0, and I'm trying to trigger my window with a latency. I tried to use a custom Trigger for that, but it doesn't work. Here is my custom trigger: public class TriggerWithLatency extends Trigger
emce
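In Flink itself, a delayed window is usually implemented by registering an event-time timer at window.maxTimestamp() plus the latency in Trigger.onElement() and returning FIRE from onEventTime(). The firing condition can be sketched in plain Python (illustrative only, not Flink's Trigger API): fire only once the watermark has passed window_end + latency.

```python
LATENCY = 3600  # delay firing by one hour (in seconds); illustrative value

def should_fire(window_end, watermark, latency=LATENCY):
    # A normal event-time trigger fires when watermark >= window_end;
    # a delayed trigger waits an extra `latency` beyond the window end.
    return watermark >= window_end + latency

# Window [0, 3600) would normally fire at watermark 3600;
# with a 1-hour latency it fires only once the watermark reaches 7200.
print(should_fire(3600, 3600))   # → False
print(should_fire(3600, 7200))   # → True
```

The key point is that the delay is expressed against the watermark, not wall-clock time, so the window still fires deterministically in event time.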
0 votes · 1 answer

Flink restarts - how to go through main function again

I'm running a flink kinesis analytics application. I've seen that using checkpoints is problematic for us because it causes duplicates (as indeed specified in flink documentation - "data that has been written since that checkpoint will be written to…
L.S
0 votes · 0 answers

How can I continuously read data from Apache Kudu in real-time using Apache Flink?

I need to read data with Apache Flink from an Apache Kudu database in real time. My use case is: I receive a message from Kafka, deserialize that message, and get an ID. If the ID exists in the database, I ignore it; if it doesn't, I need to add it in…
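The keep-only-new-IDs flow this question describes (deserialize, extract an ID, skip if already seen, otherwise record and emit) can be sketched in plain Python, with a set standing in for the external lookup table (illustrative only; not Kudu's or Flink's API):

```python
def dedup(messages, seen=None):
    """Pass through only messages whose 'id' has not been seen before."""
    seen = set() if seen is None else seen   # stands in for the ID lookup table
    out = []
    for msg in messages:
        msg_id = msg["id"]
        if msg_id in seen:      # ID already in the "database": ignore it
            continue
        seen.add(msg_id)        # new ID: record it and emit the message
        out.append(msg)
    return out

print(dedup([{"id": 1}, {"id": 2}, {"id": 1}]))  # → [{'id': 1}, {'id': 2}]
```

In Flink this pattern is more commonly built with the stream keyed by the ID and a per-key state (e.g. ValueState in a KeyedProcessFunction) rather than a synchronous lookup against an external store on every record, which avoids a round trip per message.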
0 votes · 1 answer

How to figure out the size of a ListState?

In my Flink project, there are two topics, and I run a join query on both of them: StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env); …
newbie5050
0 votes · 1 answer

ClassCastException while Flink run : cannot assign instance of java.util.LinkedHashMap to field org.apache.flink.runtime.jobgraph.JobVertex.results

Error when running a Flink job: ClassCastException: cannot assign instance of java.util.LinkedHashMap to field org.apache.flink.runtime.jobgraph.JobVertex.results of type java.util.ArrayList in instance of…
0 votes · 1 answer

Consuming JSON from Kafka in Flink

package org.example; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; import org.apache.flink.connector.kafka.source.KafkaSource; import…
0 votes · 0 answers

How to access a node's file from a FlinkSessionJob?

I recently installed the Flink Kubernetes operator on a Linux system. I created YAML files for a FlinkDeployment and a FlinkSessionJob. The FlinkDeployment was created successfully, but the FlinkSessionJob can't be created because the jarURI path is not accessible. Here…
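A likely cause, going by the operator's documented behavior: for a FlinkSessionJob the jar is fetched by the operator rather than read off a node's local filesystem, so a jarURI pointing at a file that only exists on a worker node is unreachable. Serving the jar over HTTP(S), or from a volume mounted into the relevant pods, usually works. A hedged sketch of the manifest (all names and the URL below are made up):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: example-session-job              # hypothetical name
spec:
  deploymentName: example-session-cluster  # the FlinkDeployment (session cluster) to submit to
  job:
    # Use a URI reachable from the operator, e.g. HTTP(S),
    # rather than a path that exists only on one node:
    jarURI: https://example.com/jars/my-job.jar
    parallelism: 2
    upgradeMode: stateless
```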
0 votes · 0 answers

Flink lookup join with unbounded stream and bounded Hive table not working

I'm looking for a working sample code snippet for a lookup join of a stream with a Hive table in Flink, using version 1.16. The Flink documentation provides examples of creating a new Hive table and a stream table backed by a Kafka connector.…
0 votes · 1 answer

Share a client object among custom Flink KeyedProcessFunctions

I have an object that is a client for an external service, from which I grab data to join with another Kafka source that is an operator in the Flink topology. What are some ways or tricks to avoid creating a new client within…
Baiqing
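A common shape for this problem is a lazily initialized per-process singleton: in Flink terms, create the client behind a static reference in a RichFunction's open() so every operator subtask in the same JVM shares one instance. The pattern can be sketched in plain Python (illustrative only, not Flink code; ExternalClient is a made-up stand-in):

```python
import threading

class ExternalClient:
    """Stand-in for an expensive external-service client (illustrative)."""
    instances = 0
    def __init__(self):
        ExternalClient.instances += 1

_client = None
_lock = threading.Lock()

def get_client():
    # Double-checked lazy initialization: every caller in this process
    # (cf. every operator subtask in one TaskManager JVM) shares one client.
    global _client
    if _client is None:
        with _lock:
            if _client is None:
                _client = ExternalClient()
    return _client

a, b = get_client(), get_client()
print(a is b, ExternalClient.instances)  # → True 1
```

The lock matters because several subtasks (threads) may call open() concurrently within one task manager; without it, each could construct its own client.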
0 votes · 0 answers

Records not showing in Flink when watermarking is activated

Trying to run a Flink application locally with a local Kinesis stream. The following code works perfectly (records can be seen in the sink table path), but when I change the watermark from event_ts to event_ts - INTERVAL '10' SECOND, the…
0 votes · 0 answers

Flink: different checkpoints for different pipelines

I have a use case where I want to run 2 independent processing flows on Flink. The two flows will have high parallelism. The 2 flows would look like Source1 -> operator1 -> Sink1 and Source2 -> operator2 -> Sink2. I set up the above 2 flows as 2…
Vicky
0 votes · 1 answer

Flink sink operator with multiple upstream operators

I want a sink operator to be the downstream of multiple upstream process-function operators; however, I want the sink operator to know which upstream operator each record comes from so it can add a special tag. I don't want to include…
Baiqing
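One common answer to this question's shape is to tag each branch's records with a source label before merging the branches (in Flink terms, a map() on each stream before union()), so the sink never has to guess where a record came from. Sketched in plain Python with lists standing in for streams (all names illustrative):

```python
def tag(records, source):
    # Wrap each record with the name of the operator that produced it.
    return [{"source": source, "value": r} for r in records]

upstream_a = [1, 2]   # stands in for the first process function's output
upstream_b = [3]      # stands in for the second process function's output

# Tag each branch, then merge them (cf. a.map(tagA).union(b.map(tagB))).
merged = tag(upstream_a, "opA") + tag(upstream_b, "opB")
print(merged)
```

This keeps the sink generic: it just reads the "source" field off each record instead of carrying any knowledge of the topology.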
0 votes · 0 answers

Apache Flink: PyFlink job erroring when trying to consume from Kafka using the new Flink KafkaSource API

I'm trying to consume from a Kafka topic using the Flink DataStream Kafka connector, described in the official documentation. I'm using PyFlink, and running a very simple example that looks like this: from pyflink.common import…
0 votes · 0 answers

How to pass the TLS certs to the latest version of flink-connector-pulsar

I am trying to connect my Flink application to a Pulsar topic for ingesting data. The topic is active and I am able to ingest the data via a normal Java application. When I try to use the Flink application to ingest the data from the same topic,…