Questions tagged [flink-batch]
158 questions
0
votes
0 answers
How to use the Flink CLI on Windows using their Docker approach
I deployed Flink locally on Kubernetes (minikube) using their Helm chart: helm install -n flink riskfocus/flink --generate-name. After that, I opened localhost:8081/ on my PC and I see the UI as expected, with 4 available task…

Eugenio.Gastelum96
- 164
- 1
- 13
0
votes
1 answer
Flink: force checkpoint
Currently, the Flink application is configured and implemented to create
Avro files on every checkpoint.
Is it possible to force the Flink application to create an Avro file
on demand, instead of at a configurable time interval?
Are there any REST APIs or any…

Hareesh
- 41
- 4
0
votes
0 answers
Based on which preferences will a Job Manager in Flink decide to schedule a task on a specific task manager or task slot?
I am trying to understand the mechanism that Flink, or more specifically the job manager, follows to deploy tasks on a task manager or a task slot, so I will try to explain myself in three questions:
1- Does the job manager deal directly with the…

Mahmoud
- 13
- 3
0
votes
0 answers
How to submit Flink batch job requests per customer on Amazon EKS using S3 buckets for source and sink?
I am new to Kubernetes and to using Flink for batch processing. I'd like to set up a Flink job on EKS, and I have about 2.5 TB of data that needs some aggregations performed every 30 minutes. (Overall, I intend to process 120 TB of data per day from…

sbnukala
- 33
- 3
0
votes
0 answers
Flink lookup join with unbounded stream and bounded Hive table not working
I'm looking for a working sample code snippet for a lookup join of a stream with a Hive table in Flink, using version 1.16. The Flink documentation provides examples that create a new Hive table and a stream table backed by a Kafka connector.…

user3415433
- 1
- 2
0
votes
0 answers
Flink: different checkpoints for different pipelines
I have a use case where I want to run 2 independent processing flows on Flink. Both flows will have high parallelism. The 2 flows would look like:
Source1 -> operator1 -> Sink1
Source2 -> operator2 -> Sink2
I set up the above 2 flows as 2…

Vicky
- 2,999
- 2
- 21
- 36
0
votes
0 answers
Apache Flink: PyFlink job erroring when trying to consume from Kafka using the new Flink KafkaSource API
I'm trying to consume from a Kafka topic using the Flink DataStream Kafka connector, described in the official documentation [here][1]
I'm using PyFlink and running a very simple example, which looks like this:
from pyflink.common import…

Matar
- 73
- 1
- 7
0
votes
0 answers
An error occurs when Flink writes data to Hive with INSERT OVERWRITE
I needed to set up an offline repository, so I used Flink and Hive. An error was reported when I used a partition overwrite insert into Hive.
com.py.project.tproc.data.common.exception.BigDataRuntimeException: Streaming mode not support overwrite.
The…
0
votes
1 answer
Use system built-in functions in the Flink Table API
I'm struggling to use the system built-in functions in the Flink Table API, in particular the ROW_NUMBER() function. I saw some examples, but all of them were in Flink SQL and I'm looking for the Table API syntax. I read that there isn't any Table API…

JoeHills
- 43
- 4
0
votes
1 answer
Writing RDBMS data to an S3 bucket using Flink or PyFlink
If this kind of error occurs while writing data to an S3 bucket using Flink and PyFlink:
[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.util.SerializedThrowable: The AWS Access Key Id you provided does not exist in our records.…

Vikas Duvedi
- 1
- 1
0
votes
0 answers
Writing Postgres table records to S3 using Flink
If this kind of error occurs while executing an INSERT statement in Flink (for example, when ingesting RDBMS data into S3; in my case I was trying to write from Postgres to an S3 bucket using Flink)
[ERROR] Could not execute SQL statement.…

Vikas Duvedi
- 1
- 1
0
votes
1 answer
How to get insights about which data is in each slot or Operator instance?
I’m trying to get insights about the data inside each slot in Flink to understand how exactly the data is distributed,
but it’s really confusing for me to know where exactly to look.
I am working with a word counting example with a small text file, I…

Mahmoud
- 13
- 3
0
votes
0 answers
java.util.ServiceConfigurationError: io.grpc.NameResolverProvider: Provider io.grpc.netty.shaded.io.grpc.netty.UdsNameResolverProvider not a subtype
While submitting the Flink job on the Dataproc cluster, I am getting a java.util.ServiceConfigurationError: io.grpc.NameResolverProvider: Provider io.grpc.netty.shaded.io.grpc.netty.UdsNameResolverProvider not a subtype exception. We are trying to read the…

Nagesh B Viswanadham
- 27
- 3
0
votes
0 answers
Flink job No Class Found exception even after providing the dependencies
I am not able to connect to Pub/Sub from the Flink job running on the Dataproc cluster.
Here is the code I am using to connect to Pub/Sub:
{
StreamExecutionEnvironment streamExecEnv =…

Nagesh B Viswanadham
- 27
- 3
0
votes
0 answers
How to print messages read from Pub/Sub in Flink
While reading messages from Pub/Sub with Flink code, I am not able to print them on the console. Where do I find the messages that were read? Here is the code base as well as the output from the command-line interface:
public class…

Nagesh B Viswanadham
- 27
- 3