Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and at the same time Flink benefits from Python's rich scientific computing functions.

What is PyFlink on apache.org

258 questions
0
votes
1 answer

How to have multiple INSERT statements within single PyFlink SQL job?

Is it possible to have more than one INSERT INTO ... SELECT ... statement within a single PyFlink job (on Flink 1.13.6)? I have a number of output tables that I create and I am trying to write to these within a single job, where the example…
John
  • 10,837
  • 17
  • 78
  • 141
0
votes
1 answer

Is there an example of PyFlink SQL unit testing in a self-contained repo?

Is there an example of a self-contained repository showing how to perform SQL unit testing of PyFlink (specifically 1.13.x if possible)? There is a related SO question here, where it is suggested to use some of the tests from PyFlink itself. The…
John
  • 10,837
  • 17
  • 78
  • 141
0
votes
1 answer

What is the recommended way to automate Flink job submission on an AWS EMR cluster during pipeline deployment?

I am new to Flink and EMR cluster deployment. Currently we have a Flink job and we are manually deploying it on AWS EMR cluster via Flink CLI stop/start-job commands. I wanted to automate this process (Automate updating flink job jar on every…
priyadhingra19
  • 333
  • 4
  • 15
0
votes
0 answers

Flink Oracle Connection

I am using AWS Kinesis Studio, which supports Flink 1.13. I see that Flink 1.13 does not support Oracle connections. Based on the documentation for version 1.13, it supports MySQL, PostgreSQL,…
0
votes
1 answer

Problem when running the first Flink python code

I want to run my first Flink code, so I created a virtual environment and ran it with: python tab.py I found "What's wrong with my Pyflink setup that Python UDFs throw py4j exceptions?" but it doesn't work.
Zak_Stack
  • 103
  • 8
0
votes
1 answer

How to find any factory for identifier 'elasticsearch-7' that implements 'org.apache.flink.table.factories.DynamicTableFactory' in the classpath

I am trying to create a pyflink application with table API and elasticsearch as sink. from pyflink.table import TableEnvironment, EnvironmentSettings def log_processing(): env_settings =…
kell
  • 1
  • 1
0
votes
1 answer

Invoking Flink stateful function using REST API

I'm writing an Apache Flink Statefun application using Python. I'm looking for help/pointers to invoke an existing stateful function via a REST POST/GET call. I referred to…
Himanshu
  • 3
  • 2
0
votes
1 answer

I have configured my Flink Application using PyFlink, but I want to change the Job Name

I have configured Amazon Kinesis Data Analytics using PyFlink, but I want to change the Job Name to whatever I want. How can I do this?
0
votes
1 answer

Is it possible to run batch processing on a dynamic table in Flink?

Currently I run multiple differently structured ETL jobs on the same table with the following steps: sync data from the RDBMS to the data warehouse continuously; run multiple ETL jobs at different times (on the data in the data warehouse at the corresponding point in time). If it's…
0
votes
1 answer

Apache Flink Vectorized UDF Throws Series Ambiguous Error

I am trying to use Apache Flink to convert latitude and longitude to WGS4 coordinates using the pyproj library. I want to use a vectorized UDF, but whenever I pass data to the vectorized UDF, it throws an error: Truth value of a Series is ambiguous. Use a.empty,…
Sudip Adhikari
  • 157
  • 2
  • 12
0
votes
2 answers

Rest API to submit PyFlink job

Is there any way to submit a PyFlink job to a cluster using the REST API? I checked out this link https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/rest_api/ but did not find any API related to Python files. Thanks!
thaoht
  • 1
0
votes
2 answers

pyflink, ImportError: No module named pyflink

I am testing pyflink on os: centos7 flink version: flink-1.14.3 virtualenv python version: Python 3.6.8 pip list: apache-beam 2.27.0 apache-flink 1.14.3 apache-flink-libraries 1.14.3 avro-python3 1.9.2.1 certifi …
David
  • 103
  • 1
  • 9
0
votes
1 answer

Pyflink view data inside Jupyter notebook like Head/Show in pandas/pyspark

I'm new to PyFlink and trying to learn. I was using PyFlink with Jupyter Notebook and performing some basic operations. When I perform certain operations, it returns a table location in memory like ". To…
0
votes
2 answers

pyflink TableException: Failed to execute sql

I use pyflink to run Flink streaming. If I run Flink in standalone mode, it works, but when I run Flink in yarn-per-job mode, it fails and reports "pyflink.util.exceptions.TableException: Failed to execute sql". The yarn-per-job command is: flink run -t…
0
votes
0 answers

Pyflink session window aggregation by separate keys

I'm trying to wrap my head around the PyFlink DataStream API. My use case is the following: The source is a Kinesis data stream consisting of the…