Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and at the same time Flink benefits from Python's rich scientific computing functions.

What is PyFlink on apache.org

258 questions
0
votes
1 answer

pyflink use kafka_source_ddl an error occurred

Question: I want to use PyFlink to read Kafka messages. Running the command ./bin/flink run -m 10.0.24.13:8081 -py /usr/local/project/cdn_flink/cdn_demo.py shows this error: File "/usr/local/flink-1.14.2/opt/python/pyflink.zip/pyflink/table/table.py", line 1108, in…
shark.shi
  • 3
  • 1
0
votes
1 answer

Apache Flink - WordCount - NoResult - PyFlink

I have developed a Word Count program using PyFlink. The program does not throw any error, yet it does not produce the desired output. According to the code, the program should create a new text file, but no file is generated at the time of execution.…
0
votes
0 answers

Flink-Kafka in Python: Deserialization Error

I'm trying to build a custom tumbling window with PyFlink by reading data from a Kafka source. I have a Kafka input topic with 7 partitions. My KafkaProducer sends a message every 15 secs. This is an example: Producer messages. Each message is sent…
0
votes
1 answer

Flink split pipeline

Why does Flink split the pipeline into several jobs if there is an execute_insert in the pipeline? docker-compose exec jobmanager ./bin/flink run --pyModule my.main -d --pyFiles /opt/pyflink/ -d Job has been submitted with JobID…
0
votes
1 answer

Flink SQL running out of memory doing Select - Insert from RDS to Mysql

In my pipeline I am using PyFlink to load and transform data from an RDS and sink it to MySQL. Using Flink CDC I am able to get the data I want from the RDS, and with the JDBC library I sink it to MySQL. My aim is to read 1 table and create 10 others using a…
0
votes
0 answers

ImportError: cannot import name 'OldCsv' from 'pyflink.table.descriptors'

I have recently started using Flink for data processing. When I tried to run a Table API hashtag count by importing pyflink, I was not able to import OldCsv and FileSystem from pyflink.table.descriptors. I have also downloaded apache-flink using:…
0
votes
1 answer

import local packages inside PyFlink

I'm trying to write a local package in a PyFlink project, but I can only import it via a relative path, like from .package import func. Can I use absolute paths for packages inside a PyFlink project imported with env.add_python_file('/path_to_project')?
0
votes
1 answer

How to read multi directories with Table API in PyFlink?

I want to read multiple directories with the Table API in PyFlink: from pyflink.table import StreamTableEnvironment from pyflink.datastream import StreamExecutionEnvironment, RuntimeExecutionMode if __name__ == 'main__': env =…
chenyf
  • 5,048
  • 1
  • 12
  • 35
0
votes
2 answers

Flink SQL Tumble Aggregation result not written out to filesystem locally

Context: I have a Flink job written with the Python SQL API. It consumes source data from Kinesis and produces results to Kinesis. I want to run a local test to ensure the Flink application code is correct, so I mocked out both the source Kinesis and…
Alfred
  • 1,709
  • 8
  • 23
  • 38
0
votes
1 answer

Pyflink: A group window expects a time attribute for grouping in a stream environment

I have the following code using pyflink 1.11: import os from pyflink.datastream.stream_execution_environment import StreamExecutionEnvironment from pyflink.datastream.time_characteristic import TimeCharacteristic from pyflink.table import ( …
Larissa Leite
  • 1,358
  • 3
  • 21
  • 36
0
votes
1 answer

Pyflink 1.14 table connectors - Kafka authentication

I've only seen PyFlink Table API examples of Kafka connections that do not contain authentication details in the connection setup (doc ref), namely the source table connection: source_ddl = """ CREATE TABLE source_table( a…
Paul
  • 756
  • 1
  • 8
  • 22
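For readers landing on the Kafka authentication question above, here is a minimal sketch of how such a DDL string might be assembled, assuming SASL/PLAIN over SSL. It relies on the Kafka SQL connector forwarding `properties.*` options to the Kafka client. The helper name `kafka_source_ddl`, the single `a VARCHAR` column, the topic, the broker address, and the credentials are all hypothetical placeholders, and depending on how the connector jar is packaged the login module class name may need a shaded prefix.

```python
def kafka_source_ddl(table, topic, servers, username, password):
    """Build a CREATE TABLE statement for a SASL/PLAIN-secured Kafka topic.

    Hypothetical helper for illustration only; values are interpolated
    without escaping, so they must not contain single quotes.
    """
    # JAAS configuration string expected by the Kafka client (assumed PLAIN).
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return f"""
CREATE TABLE {table} (
    a VARCHAR
) WITH (
    'connector' = 'kafka',
    'topic' = '{topic}',
    'properties.bootstrap.servers' = '{servers}',
    'properties.security.protocol' = 'SASL_SSL',
    'properties.sasl.mechanism' = 'PLAIN',
    'properties.sasl.jaas.config' = '{jaas}',
    'scan.startup.mode' = 'latest-offset',
    'format' = 'json'
)
"""


# Placeholder values; replace with your own cluster details.
ddl = kafka_source_ddl("source_table", "my_topic", "localhost:9093", "alice", "secret")
print(ddl)
```

The resulting string would then be passed to the table environment's `execute_sql` call, as in the DDL-based examples the question refers to.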
0
votes
1 answer

Running Flink's python examples

I'm trying to run the Python examples that come with Apache Flink 1.14.0 but keep getting errors. I installed per the instructions, and the .jar examples work fine, so I'm not sure what the issue is with the Python ones. For instance, the command…
BBrooklyn
  • 350
  • 1
  • 3
  • 15
0
votes
1 answer

how to consume two source stream with table API in Flink

I have a Flink job that uses the Python Table API. Now my application is going to consume an additional source stream. I am curious what the recommended way is to consume multiple source streams with the Table API. Additional information: The two input…
Alfred
  • 1,709
  • 8
  • 23
  • 38
0
votes
0 answers

How to import a private package in PyFlink

I'm trying to import a private package in PyFlink. I have a UDF that needs to call a function in a private library (not available through pip install because the repo is private). Therefore I can't use something like…
ElCapitaine
  • 830
  • 1
  • 11
  • 24
0
votes
1 answer

Windowed grouping using message keys in PyFlink

I'm using PyFlink 1.13 for a project and I'm trying to do the following: (1) read data from a Kafka topic where messages contain a UserId; (2) perform tumbling windowing over 2 seconds on the data; (3) call a Python UDF with my window values. Here's a visual…
ElCapitaine
  • 830
  • 1
  • 11
  • 24
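To make the windowing semantics in the question above concrete, here is a small plain-Python sketch (not PyFlink API) of how events could be assigned to 2-second tumbling windows per key. The function names, the event tuple layout, and the sample data are assumptions made for illustration; a real job would use Flink's own window operators.

```python
from collections import defaultdict

WINDOW_SIZE_MS = 2_000  # 2-second tumbling windows, as in the question


def window_start(timestamp_ms, size_ms=WINDOW_SIZE_MS):
    """Return the start of the tumbling window that contains timestamp_ms."""
    return timestamp_ms - (timestamp_ms % size_ms)


def group_by_key_and_window(events, size_ms=WINDOW_SIZE_MS):
    """Group (user_id, timestamp_ms, value) tuples by user and window start."""
    windows = defaultdict(list)
    for user_id, timestamp_ms, value in events:
        windows[(user_id, window_start(timestamp_ms, size_ms))].append(value)
    return dict(windows)


# Hypothetical sample events: (user_id, event time in ms, payload)
events = [
    ("u1", 100, "a"),
    ("u1", 1900, "b"),
    ("u2", 1500, "c"),
    ("u1", 2100, "d"),
]
grouped = group_by_key_and_window(events)
for (user_id, start), values in sorted(grouped.items()):
    print(user_id, start, values)
```

Each `(user_id, window start)` pair collects the values that a per-window function would receive, which is the role the Python UDF plays in the question.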