Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and at the same time Flink benefits from Python's rich scientific computing functions.

What is PyFlink on apache.org

258 questions
2
votes
1 answer

pyflink AttributeError: type object 'EnvironmentSettings' has no attribute 'in_streaming_mode'

%pyflink from pyflink.table import EnvironmentSettings, StreamTableEnvironment env_settings = EnvironmentSettings.in_streaming_mode() table_env = StreamTableEnvironment.create(environment_settings=env_settings) Will fail with the following…
hi im Bacon
  • 374
  • 1
  • 12
2
votes
0 answers

Error when sink DataStream to JDBC in PyFlink after using map function

My Code: from pyflink.datastream.connectors.jdbc import JdbcSink, JdbcExecutionOptions, JdbcConnectionOptions from pyflink.common.typeinfo import Types from pyflink.datastream import StreamExecutionEnvironment JDBC_JAR_PATH =…
Edgar Le
  • 21
  • 1
2
votes
1 answer

Passing environment variables to Flink job on Flink Kubernetes Cluster

I'm using Flink Kubernetes Operator 1.3.0 and need to pass some environment variables to a Python job. I have followed the official documentation and the example runs fine. How can I inject environment variables so that I can use them inside the…
sunny
  • 27
  • 5
2
votes
1 answer

Using SourceFunction and SinkFunction in PyFlink

I am new to PyFlink. I have done the official training exercise in Java: https://github.com/apache/flink-training However, the project I am working on must use Python as a programming language. I want to know if it is possible to write a data…
IP89
  • 41
  • 5
2
votes
1 answer

Track changes in an aggregated value using PyFlink

Suppose we have an incoming stream of ad click events and we want to track the number of clicks each ad got in the last 5 minutes. Our input schema is ad_id VARCHAR(10), clicked_at TIMESTAMP(3) and the output schema is ad_id …
2
votes
1 answer

Error when pip installing apache-flink due to numpy

I'm trying to install Apache Flink with either python3 -m pip install apache-flink or pip3 install apache-flink, but both fail with an exit code 1 error: clang: error: the clang compiler does not support 'faltivec', please use -maltivec and include…
sozza594
  • 45
  • 4
2
votes
0 answers

ImportError: cannot import name 'Encoder' from 'pyflink.common'

from pyflink.common import WatermarkStrategy, Encoder, Types Running the above code produces the following error: ImportError: cannot import name 'Encoder' from 'pyflink.common'
yunrui li
  • 21
  • 1
2
votes
1 answer

How do you run pyflink scripts on AWS EMR?

I am struggling to run the basic word_count.py PyFlink example that comes loaded with Apache Flink on AWS EMR. Steps taken: Successfully created an AWS EMR 6.5.0 cluster with the following applications [Flink, Zookeeper] - verified that there is a…
covariantmonkey
  • 213
  • 2
  • 6
2
votes
1 answer

Cannot run pyflink official table tutorial example on macos

I copied the complete example from table_api_tutorial. I can run the example on CentOS, and my Java colleague can run it on his MacBook. env: MacBook Pro (Retina, 13-inch, Late 2013) macOS Big Sur 11.4 $ jenv…
thinker3
  • 12,771
  • 5
  • 30
  • 36
2
votes
1 answer

application mode not support pyflink on yarn

command: /space/flink/bin/flink run-application -t yarn-application -Dyarn.application.name=kafka-hive --pyFiles /space/testAirflow/airflow/dags/ -py /space/testAirflow/airflow/dags/chloe/kafka_to_hive_1.py Flink throws an exception…
zhouyb
  • 29
  • 2
2
votes
0 answers

Cannot use some aggregate functions (sum or count) on Python Flink 1.11.X

Just wanted to check if my code is wrong or this is a PyFlink 1.11.X issue. When I try to count the number of elements in a group by query (either 'GroupBy Aggregation' or 'GroupBy Window Aggregation'), PyFlink throws the following…
carlos rodrigues
  • 317
  • 5
  • 12
2
votes
1 answer

Flink: Not able to sink a stream into csv

I am trying to sink a stream into filesystem in csv format using PyFlink, however it does not work. # stream_to_csv.py from pyflink.table import EnvironmentSettings, StreamTableEnvironment env_settings =…
yiksanchan
  • 1,890
  • 1
  • 13
  • 37
2
votes
1 answer

PyFlink Vectorized UDF throws NullPointerException

I have an ML model that takes two numpy.ndarray inputs - users and items - and returns a numpy.ndarray of predictions. In normal Python code, I would do: model = load_model() df = load_data() # the DataFrame includes 4 columns, namely, user_id, movie_id,…
yiksanchan
  • 1,890
  • 1
  • 13
  • 37
2
votes
1 answer

Failed to unit test PyFlink UDF

I am using PyFlink and I want to unit test my UDF written in Python. To test the simple udf below: # tasks/helloworld/udf.py from pyflink.table import DataTypes from pyflink.table.udf import udf @udf(input_types=[DataTypes.INT(), DataTypes.INT()],…
yiksanchan
  • 1,890
  • 1
  • 13
  • 37
2
votes
1 answer

Got "pyflink.util.exceptions.TableException: findAndCreateTableSource failed." when running PyFlink example

I am running the PyFlink program below (copied from https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/python/table_api_tutorial.html) from pyflink.dataset import ExecutionEnvironment from pyflink.table import TableConfig, DataTypes,…
yiksanchan
  • 1,890
  • 1
  • 13
  • 37