Highest Voted 'pyflink' Questions

1 vote, 0 answers

PyFlink: What's the best way to provide reference data and keep it updated

I have a use case where I want to compare incoming data against reference data provided by another service. What's the best way in PyFlink to fetch that data and refresh it regularly (every 1-2 hours)? Other…

python apache-flink pyflink

asked Jan 11 '23 at 14:28

Amir Afianian
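
One possible shape for the periodic refresh asked about above, as a minimal sketch rather than a recommended PyFlink pattern: a small cache that reloads the reference data once it is older than a configured age. The names `RefreshingReference`, `loader`, `max_age_s`, `clock` and `is_known` are hypothetical, and the sketch deliberately uses only the standard library. It assumes the reference data fits in memory and that the loader (for example, a call to the other service) is supplied by the caller. In a PyFlink job such an object could be created once per user-defined function instance, which is an assumption about how the job is structured.

```python
import time
from typing import Callable, Dict, Optional


class RefreshingReference:
    """Caches reference data and reloads it once it is older than max_age_s."""

    def __init__(self, loader: Callable[[], Dict[str, str]], max_age_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader          # hypothetical: fetches data from the other service
        self._max_age_s = max_age_s    # e.g. 3600 for a one-hour refresh interval
        self._clock = clock            # injectable so the refresh logic can be tested
        self._data: Optional[Dict[str, str]] = None
        self._loaded_at = 0.0

    def get(self) -> Dict[str, str]:
        """Return the cached data, reloading it first if it is missing or stale."""
        now = self._clock()
        if self._data is None or now - self._loaded_at >= self._max_age_s:
            self._data = self._loader()
            self._loaded_at = now
        return self._data


def is_known(record_key: str, reference: RefreshingReference) -> bool:
    """Example comparison: is the incoming key present in the reference data?"""
    return record_key in reference.get()
```

The reload happens lazily on the first lookup after the interval has passed, so no background thread is needed; the trade-off is that one record per interval pays the cost of the fetch.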


1 vote, 0 answers

Failed to deserialize Avro record: getting ArrayIndexOutOfBoundsException

I am trying to read from Kafka in Avro format using PyFlink. My program is: from pyflink.datastream import StreamExecutionEnvironment from pyflink.datastream.connectors.kafka import FlinkKafkaConsumer from pyflink.datastream.formats.avro…

apache-flink avro flink-streaming flink-sql pyflink

asked Dec 16 '22 at 13:41

mjennet


1 vote, 1 answer

How to use a field in ROW type column in Flink SQL?

I'm executing SQL in Flink that looks like this: create table team_config_source ( `payload` ROW( `before` ROW( team_config_id int, ... ), `after` ROW( team_config_id int, ... …

apache-flink flink-sql pyflink

asked Dec 05 '22 at 08:23

Rinze


1 vote, 0 answers

Not able to run simple PyFlink word_count.py on AWS EMR

I have created an EMR cluster (v5.35.0) and am trying to run a sample word_count.py to verify that I can execute a Flink job. I am able to use python3, as mentioned in the question "How do you run pyflink scripts on AWS EMR?" Using the below…

python-3.x amazon-web-services apache-flink amazon-emr pyflink

asked Oct 22 '22 at 09:45

user2022525


1 vote, 1 answer

How to read data from HDFS with Flink in Python

I want to read data from HDFS with Flink in Python. I found that it is possible with Java or Scala (https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/dataset/formats/hadoop/). Indeed, the Flink HDFS connector provides a Sink that writes…

hadoop hdfs apache-flink flink-sql pyflink

asked Sep 08 '22 at 10:03

Zak_Stack


1 vote, 1 answer

PyFlink - RabbitMQ sink: A serializer has already been registered for the state; re-registration is not allowed

I am using PyFlink with Flink 1.15.2 and read messages from RabbitMQ with the following connector: flink-sql-connector-rabbitmq-1.15.2.jar. However, when I try to sink to RabbitMQ with this code, following this link:…

python rabbitmq apache-flink pyflink

asked Sep 05 '22 at 12:42

Ali Ait-Bachir


1 vote, 0 answers

Using Python Functions in a Java Flink Job - 1.15

Is there any way to use a Python function (Aggregate, Map, etc.) within a Java Flink job? I do not want to use the SQL API. Can the DataStream API alone handle this, without this syntax: tableEnv.executeSql("CREATE TEMPORARY…

apache-flink flink-streaming flink-sql pyflink

asked Jul 27 '22 at 15:17

Metehan Yıldırım


1 vote, 0 answers

PyFlink: submit job with multiple external connector jars on Amazon Kinesis

I am following this guide to create an Amazon Kinesis Analytics application with PyFlink, and my application requires more than one external connector JAR file. In the JAR upload section, it seems I can only upload one JAR file, how…

apache-flink pyflink amazon-kinesis-analytics

asked Jul 07 '22 at 06:50

chris


1 vote, 1 answer

Flink Python Datastream API Kafka Consumer - NoClassDefFoundError ByteArrayDeserializer Error

I get an error on the Py4J side of PyFlink. The code is below: env = StreamExecutionEnvironment.get_execution_environment() env.add_jars("file:/" + os.getcwd() + "/jar_files/" + "flink-sql-connector-kafka-1.15.0.jar") type_info = Types.ROW_NAMED(['id',…

java apache-kafka apache-flink flink-streaming pyflink

asked Jul 01 '22 at 13:23

Metehan Yıldırım
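
A side note on the excerpt above, offered as a sketch and not as a diagnosis of the NoClassDefFoundError: on a POSIX system, "file:/" + os.getcwd() produces a URL of the form file://home/..., because os.getcwd() already starts with a slash. Building the URL with pathlib avoids hand-assembling the prefix. The helper name `jar_urls` and the placeholder file names below are hypothetical; the demo only touches a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import List


def jar_urls(jar_dir: str) -> List[str]:
    """Return sorted file:// URLs for every .jar file directly inside jar_dir."""
    return sorted(p.resolve().as_uri() for p in Path(jar_dir).glob("*.jar"))


# Demo with a throwaway directory and empty placeholder files (not real connectors).
with tempfile.TemporaryDirectory() as tmp:
    for name in ("connector-a.jar", "connector-b.jar"):
        (Path(tmp) / name).touch()
    (Path(tmp) / "notes.txt").touch()  # ignored: not a .jar file
    urls = jar_urls(tmp)

print(urls)

# Assumed usage in a job, where "jar_files" is the directory from the excerpt:
# env.add_jars(*jar_urls("jar_files"))
```

Because the helper returns a list, the same call covers the case where a job needs more than one connector JAR.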


1 vote, 0 answers

How to convert Table containing TIMESTAMP_LTZ into DataStream in PyFlink 1.15.0?

I have a source table using a Kinesis connector reading events from AWS EventBridge using PyFlink 1.15.0. An example of the sorts of data that are in this stream is here. Note that the stream of data contains many different types of events, where…

apache-flink flink-streaming flink-sql pyflink

asked Jun 24 '22 at 21:34

John


1 vote, 0 answers

How to install Python 3.7 or 3.8 into OpenJDK docker container without building from source?

I want to use the official Flink image on Docker Hub (https://hub.docker.com/_/flink) for use with PyFlink (specifically the 1.14.4 version). That image is based on the openjdk:11-jre base image which is Debian 11.3. Debian 11 does not seem to have…

python docker debian apache-flink pyflink

asked Jun 21 '22 at 20:43

John


1 vote, 1 answer

How to sink messages to InfluxDB using PyFlink?

I am trying to run the PyFlink walkthrough, but instead of sinking data to Elasticsearch, I want to use InfluxDB. Note: the code in the walkthrough (link above) is working as expected. In order for this to work, we need to put the InfluxDB connector inside…

apache-flink flink-streaming influxdb flink-sql pyflink

asked Jun 13 '22 at 13:58

Ermolai


1 vote, 0 answers

pyflink 1.15.0 alternative for OutputTag?

I am using ProcessFunction in a PyFlink (1.15.0) job. One of the use cases is to route invalid input to a different Kafka topic. In Java, we use OutputTag to redirect those inputs to another stream and then to a different sink. In PyFlink 1.15.0 I see…

python-3.x apache-flink flink-streaming pyflink

asked May 23 '22 at 05:10

Lakshya Garg
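
One common workaround when side outputs are not available is to tag each record and then filter on the tag, so that each tag feeds its own sink. The sketch below shows only that tag-and-filter idea on plain Python lists; it is not OutputTag and it does not use any PyFlink API. The names `tag`, `select`, `VALID` and `INVALID`, and the rule that a valid record has an integer "id", are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

VALID, INVALID = "valid", "invalid"


def tag(record: dict) -> Tuple[str, dict]:
    """Attach a routing tag; the validity rule here is a made-up example."""
    ok = isinstance(record.get("id"), int)
    return (VALID if ok else INVALID, record)


def select(tagged: Iterable[Tuple[str, dict]], wanted: str) -> Iterator[dict]:
    """Keep only the records carrying the wanted tag and drop the tag again."""
    return (rec for t, rec in tagged if t == wanted)


records = [{"id": 1}, {"id": "oops"}, {"name": "no id"}, {"id": 2}]
tagged = [tag(r) for r in records]

good = list(select(tagged, VALID))    # would go to the main sink
bad = list(select(tagged, INVALID))   # would go to the topic for rejected input
```

In a streaming job the same two steps would typically be a map that emits (tag, record) pairs followed by one filter per tag. The cost compared with a true side output is that every record passes through every filter.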


1 vote, 1 answer

I'm getting this error when running the PyFlink code below

This is the code for calculating the average of each ch[x] from a Kafka source using Apache Flink (PyFlink). I think I have imported all of the necessary libraries, and I'm getting this error when running the code: from numpy import average from…

python apache-flink flink-streaming flink-sql pyflink

asked Apr 29 '22 at 06:14

Ammanuel


1 vote, 2 answers

PyFlink with Kafka: java.lang.RuntimeException: Failed to create stage bundle factory

Python 3.8, JDK 11. I've started learning PyFlink and wrote code following the official documentation (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/python/datastream/intro_to_datastream_api/). Here is my code: from…

apache-kafka apache-beam flink-streaming py4j pyflink

asked Apr 27 '22 at 12:00

user18964074

