Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and at the same time Flink benefits from Python's rich scientific computing libraries.

What is PyFlink on apache.org

258 questions
0
votes
2 answers

PyFlink JDBC PostgreSQL catalog throwing errors for data type UUID; how to handle the UUID data type in the Flink Table API?

Apache Flink 1.11.0 Python Table API, PostgreSQL catalog: reading and writing data from PostgreSQL catalog tables which contain UUID data type columns through the Table API throws a UUID data type unsupportedOperatorException. How to handle the UUID…
bharath
  • 1
  • 1
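Since the Flink JDBC connector has no UUID type, one common workaround is to cast on the PostgreSQL side and declare the column as STRING in a manually defined Flink table instead of relying on the catalog's schema derivation. A sketch; all table, column, and connection names below are examples, not from the question:

```sql
-- PostgreSQL side: expose the uuid column as text through a view
CREATE VIEW orders_view AS
SELECT id::text AS id, amount FROM orders;

-- Flink side: bypass the catalog and declare the table by hand against the view
CREATE TABLE orders (
  id STRING,
  amount DOUBLE
) WITH (
  'connector'  = 'jdbc',
  'url'        = 'jdbc:postgresql://localhost:5432/mydb',
  'table-name' = 'orders_view',
  'username'   = 'user',
  'password'   = 'secret'
);
```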
0
votes
1 answer

PyFlink 1.11.2: couldn't configure the [taskmanager.memory.task.off-heap.size] property when registering a custom UDF

I'm very new to PyFlink and am trying to register a custom UDF using the Python API. Currently I face an issue in both the server environment and my local IDE environment. When I try to execute the example below, I get an error message: The configured…
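The error this question quotes is usually resolved by reserving task off-heap memory for the Python UDF workers, as described in Flink's memory configuration documentation. A minimal config sketch, assuming an example value of 80m:

```yaml
# flink-conf.yaml — reserve task off-heap memory for the Python UDF workers
# (80m is an example value; size it to your workload)
taskmanager.memory.task.off-heap.size: 80m
```

The same key can also be set programmatically on the table environment's configuration before the UDF is registered.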
0
votes
2 answers

How can you load a CSV into PyFlink as a Streaming Table Source?

I am trying to set up a simple playground environment to use the Flink Python Table API. The jobs I am ultimately trying to write will feed off of a Kafka or Kinesis queue, but that makes playing around with ideas (and tests) very difficult. I can…
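For this kind of playground setup, a filesystem-backed table is often enough before swapping in a Kafka or Kinesis connector. A sketch assuming Flink's built-in `filesystem` connector and `csv` format; the table name, path, and schema are placeholders:

```sql
CREATE TABLE clicks (
  user_id STRING,
  url     STRING,
  ts      TIMESTAMP(3)
) WITH (
  'connector' = 'filesystem',
  'path'      = 'file:///tmp/clicks.csv',
  'format'    = 'csv'
);
```

Queries against such a table can run under the streaming planner even though the file itself is bounded, which is usually sufficient for local experiments and tests.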
0
votes
1 answer

Apache Flink 1.11: unable to use a Python UDF through SQL Function DDL in a Java Flink streaming job

In FLIP-106 there is an example of how to call a user-defined Python function in a Java batch-job application through SQL Function DDL... BatchTableEnvironment tEnv =…
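For context, FLIP-106 introduces DDL of roughly this shape for registering Python functions from a Java/SQL job. The function, module, and table names below are hypothetical:

```sql
-- Register a Python UDF from a Java/SQL job
-- ('my_udfs.py_upper' is an example module.function path)
CREATE TEMPORARY SYSTEM FUNCTION py_upper AS 'my_udfs.py_upper' LANGUAGE PYTHON;

-- The job also needs the Python dependency config set, e.g. the documented keys:
--   python.files      = /path/to/my_udfs.py
--   python.executable = python3
SELECT py_upper(name) FROM users;
```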
0
votes
0 answers

Performing calculations on a time attribute more than once in Apache Flink

As the title says, I want to do calculations involving a time attribute in a Flink job more than once. The documentation mentions that once a time attribute is used in a calculation, it becomes a regular timestamp and can no longer be used in a calculation…
Fakhruzy
  • 13
  • 3
0
votes
1 answer

RabbitMQ custom table source and sink for pyflink 1.11

According to the docs here: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/ — is it possible to create a custom DDL RabbitMQ…
0
votes
1 answer

Python Flink installation: Failed to find the file /flink-1.12-SNAPSHOT-bin/flink-1.12-SNAPSHOT/opt/flink-python_*.jar

I am following this guide for installing Flink. After installing all the dependencies mentioned there, the following ran perfectly well: git clone https://github.com/apache/flink; mvn clean install -DskipTests; mvn clean install -DskipTests -Dfast; sudo apt-get…
Ajay Chinni
  • 780
  • 1
  • 6
  • 24
0
votes
1 answer

Using Python in Apache Flink to do analytics

Is there any way to do analytics on Flink jobs using Python specifically? The Table API, as far as I understand, will retrieve the data, and we can only use the functions in the API to transform the data. Did I miss or misunderstand anything in the…
Fakhruzy
  • 13
  • 3
0
votes
1 answer

Where to find the relevant PyFlink doc?

I'm new to Flink and I found that Flink supports Python through PyFlink. But I don't know where to find the relevant PyFlink docs or examples.
Xingbo Huang
  • 363
  • 1
  • 5
0
votes
3 answers

Python Version of WordCount is failing on Flink

On CentOS (CentOS 8.0.1905, 64-bit) I tried to run the Python (3.6.8) version of the WordCount program on Flink (1.9) as described here. I got the error below. The same environment works fine with the Java version of WordCount. What is it that I am missing…
Sarma
  • 11
  • 4
0
votes
2 answers

How does a PyFlink job call external jar?

I want to call my Java interfaces in a jar file from a PyFlink job. No solutions are found in the official documentation.
Xingjun Wang
  • 413
  • 2
  • 4
  • 17
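For reference, the usual route is Flink's documented `pipeline.jars` option, which ships a jar with the job so its classes become reachable (e.g. from a Java UDF registered via DDL). A config sketch; the path is a placeholder:

```yaml
# flink-conf.yaml, or the same key set on the (table) environment's configuration
# file:///path/to/your-interfaces.jar is a placeholder path
pipeline.jars: file:///path/to/your-interfaces.jar
```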
-1
votes
1 answer

How to solve "avro.AvroRowDeserializationSchema does not exist in the JVM" when trying to get data using pyflink

PyFlink is getting data from a Kafka producer. When the producer sends data in JSON format, JsonRowDeserializationSchema works fine, but when I send data in Avro format, AvroRowDeserializationSchema fails with the following exception: Exception in thread…
-1
votes
0 answers

PyFlink AggregateFunction in a window TVF cannot sink to connector='kafka'; it reports consuming update changes

data source: class Sum0(AggregateFunction): def get_value(self, accumulator): return accumulator[0] def create_accumulator(self): return Row(0) def accumulate(self, accumulator, *args): if args[0] is not None: …
faronzz
  • 117
  • 1
  • 1
  • 4
-1
votes
1 answer

Flink HBase connector: writing data to an HBase sink table: Unable to create a sink for writing table

I want to write data to an HBase sink table. I have HBase version 2.2.0, which is compatible with Flink version 1.14.4. I defined the HBase sink table as follows: sink_ddl = """ CREATE TABLE hTable ( datemin STRING, family2…
Zak_Stack
  • 103
  • 8
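For reference, a sink DDL of roughly this shape, reconstructed from the excerpt; the connector identifier and column layout are assumptions based on the `hbase-2.2` connector shipped with Flink 1.14, and the HBase table name and ZooKeeper address are placeholders:

```sql
CREATE TABLE hTable (
  datemin STRING,
  family2 ROW<col1 STRING>,
  PRIMARY KEY (datemin) NOT ENFORCED   -- the primary key becomes the HBase rowkey
) WITH (
  'connector'        = 'hbase-2.2',
  'table-name'       = 'mytable',
  'zookeeper.quorum' = 'localhost:2181'
);
```

"Unable to create a sink" at this point usually means the matching `flink-sql-connector-hbase` jar is not on the job's classpath (e.g. via `pipeline.jars`).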
-1
votes
1 answer

Create a PyFlink DataStream consumer from a tweets Kafka producer in Python

I want to create a streaming Kafka consumer in PyFlink which can read tweets data after deserialization (JSON). I have PyFlink version 1.14.4 (the latest version). Can I have an example of a Kafka producer and simple Flink consumer code for streaming in…
Zak_Stack
  • 103
  • 8
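A DataStream Kafka consumer needs the Kafka connector jar plus a deserialization schema; as a simpler starting point, a Table-API sketch of a JSON Kafka source often suffices. The topic, brokers, group id, and schema below are placeholders, and the table assumes the `flink-sql-connector-kafka` jar is on the job's classpath:

```sql
CREATE TABLE tweets (
  id         BIGINT,
  `text`     STRING,
  created_at TIMESTAMP(3)
) WITH (
  'connector'                    = 'kafka',
  'topic'                        = 'tweets',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id'          = 'tweet-reader',
  'scan.startup.mode'            = 'earliest-offset',
  'format'                       = 'json'
);
```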