Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and Flink in turn benefits from Python's rich scientific computing functions.

What is PyFlink on apache.org

258 questions
2 votes, 1 answer

PyFlink - Kafka - Missing module

I am trying to get started with PyFlink and Kafka, but get the error below. Thanks for your support! Installation: python -m pip install apache-flink; pip install pyFlink. Code: from pyFlink.datastream import…
py-r • 419 • 5 • 15
2 votes, 1 answer

PyFlink - Issue using Scala UDF in JAR

I'm trying to register a Scala UDF in PyFlink using an external JAR as follows, but get the error below. Scala UDF: package com.dummy import org.apache.flink.table.functions.ScalarFunction class dummyTransform(factor: Int) extends ScalarFunction { …
py-r • 419 • 5 • 15
2 votes, 1 answer

Sink processed stream data into a database using Apache-flink

Is it possible to sink processed stream data into a database using PyFlink? All the methods for writing processed data seem limited to saving it in txt, CSV, or JSON format, and there appears to be no way to sink data into a database.
2 votes, 1 answer

Apache-Flink 1.11 Unable to use Python UDF in SQL Function DDL

According to this Confluence page: https://cwiki.apache.org/confluence/display/FLINK/FLIP-106%3A+Support+Python+UDF+in+SQL+Function+DDL Python UDFs can be used within SQL functions as of Flink 1.11. I went to the Flink docs…
2 votes, 1 answer

PyFlink - specify Table format and process nested JSON string data

I have a JSON data object as such: { "monitorId": 865, "deviceId": "94:54:93:49:96:13", "data": "{\"0001\":105.0,\"0002\":1.21,\"0003\":0.69,\"0004\":1.46,\"0005\":47.43,\"0006\":103.3}", "state": 2, "time": 1593687809180 } The…
343GuiltySpark • 123 • 1 • 2 • 11
2 votes, 2 answers

where's socketTextStream in pyflink

I want to translate the following code into PyFlink and run it in pyflink-shell.sh afterwards. public class MapDemo { private static int index = 1; public static void main(String[] args) throws Exception { //1. Get the execution environment configuration …
user13268019
1 vote, 0 answers

"Copy" instead of "insert into" in PostgreSQL, using pyflink methods

I wrote a PyFlink consumer which gets messages from Kafka. It works fine. Here is a part of my code: def _read_from_kafka(self): bootstrap_server = self.config['source']['kafka']['host'] bootstrap_port =…
1 vote, 0 answers

Cannot load user class error in Flink with Python user-defined table functions (UDTFs)

I ran into some pretty strange behavior today. I'm using Flink 1.17.1 and PyFlink, setting up something with the Kafka connector and Python user-defined table functions (UDTFs). I found both a workaround and a solution in the end; I'm posting here…
tudorpavel • 149 • 7
1 vote, 0 answers

Unable to set logging level, and logs inside operators not shown in local execution

For local execution and testing, I'd like to show more log messages in my Python Flink application. However, setting the logging level has no effect and more importantly no logs that happen during actual execution of the graph are shown. Note that…
Felix • 2,548 • 19 • 48
1 vote, 0 answers

How to deploy a scalable pyflink "app" the right way?

We want to keep things simple and use python wherever possible. Thus we want to use pyflink (latest version, we are flexible) for continuous queries. We have written the code (which pulls live data from a pulsar cluster with the flink-pulsar…
feder • 1,849 • 2 • 25 • 43
1 vote, 1 answer

Reading from Kafka with pyflink not working

This is an Apache Flink example with PyFlink (link). I want to read records from the Kafka topic and print them with PyFlink. Producing to the Kafka topic works, but I can't read from that topic. Error when consuming from the Kafka topic: raise…
Mohammadreza Khedri • 2,523 • 1 • 11 • 22
1 vote, 0 answers

Pyflink fails converting datetime when executing and collecting SQL timestamp

I'd like to test some streams I've created with execute_and_collect instead of a JDBC sink. The sink succeeds in converting a Row to insert data into a DB, but execute_and_collect fails with: AttributeError: 'bytearray' object has no attribute…
Felix • 2,548 • 19 • 48
1 vote, 0 answers

How to pass arguments to Python script or read config files in Python scripts when using Flink?

For example, I'm submitting a Python script my_driver.py to Flink by running bin/flink run --python my_driver.py. Can I pass any user-defined arguments to my_driver.py? Or can I open and read a config file in my_driver.py?
Rinze • 706 • 1 • 5 • 21
1 vote, 0 answers

How to use map with KafkaSource and KafkaSink in pyflink

I'm pretty new to Flink and I'm trying to use PyFlink (1.16.0) to read from an input topic and then map it before sinking it to an output topic. However, I'm getting an error. Here is my code: from random import randrange from pyflink.common import…
emce • 11 • 2
1 vote, 0 answers

Flink SQL doesn't unpack gzipped source on the fly - but still parses PART of it

UPD: I actually found a Jira ticket which describes my problem here: https://issues.apache.org/jira/browse/FLINK-30314 Waiting for its resolution... I've run into a strange issue and I need to ask you guys whether I'm missing anything. I have an issue with…