Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and at the same time Flink benefits from Python's rich scientific computing functions.

What is PyFlink on apache.org

258 questions
0
votes
0 answers

Problem with data transmission between Flink and Kafka running on Python in a Windows environment

I am a beginner with Flink and Kafka. I started ZooKeeper and Kafka on a Windows system and tried to test the 'Kafka With Json Format' example from the official website in a Python environment. import logging import sys from pyflink.common import Types from…
Sherri
0
votes
1 answer

Flink batch vs. streaming: how do they process real-time data?

I have read the documentation on streaming mode and batch mode. I assume that if I have an unbounded stream and apply windows (such as tumbling windows) to it, it becomes a bounded stream? Please correct me and stop me here if I am wrong. But that's what I…
0
votes
1 answer

Apache Flink - Getting `NoResourceAvailableException` with local execution while using `slot_sharing_group`

I'm trying to run a Flink 1.17.1 job with local execution through PyCharm. My code is using the DataStream API and I'm reading data from a Kafka topic and printing it to the console with .execute().print(). However, I'm encountering some errors when…
ElCapitaine
0
votes
0 answers

Apache Flink Python UDFs are not working despite following the AWS repo on PyFlink UDFs

I see that AWS offers Kinesis Data Analytics with Apache Flink, and that the PyFlink library for Python provides a way to create UDFs. In fact, AWS seems to have a repo with examples of PyFlink UDFs. I have tried following the guide, and doing the UDF…
0
votes
0 answers

How to use the Flink CLI on Windows using their Docker approach

I have deployed Flink locally on Kubernetes (minikube) using their Helm chart: `helm install -n flink riskfocus/flink --generate-name`. After that, on my PC I opened localhost:8081/ and I do see the UI as it should, with 4 available task…
0
votes
0 answers

Flink WordCount.jar sample file not running on local Kubernetes

I have started a local Kubernetes deployment of Flink on my minikube cluster using their Helm chart: `helm install -n flink riskfocus/flink --generate-name`. After that, on my PC I opened localhost:8081/ and I do see the UI as it should, with 4 available…
0
votes
0 answers

Output of User-Defined Table Function Cannot Be Converted to Stream

I have a DataStream that I am converting to a Table so that I can use Flink SQL syntax on it. My Flink SQL query includes a user-defined table function, meaning each input row results in multiple output rows, using a generator as in the…
0
votes
0 answers

Apache PyFlink Kafka connector not working on Windows

I am getting the following error: py4j.protocol.Py4JJavaError: An error occurred while calling o0.execute. : org.apache.flink.runtime.client.JobExecutionException: Job execution failed. Version: Windows 10, Python: 3.8.10, py4j: 0.10.9.7,…
user3024119
0
votes
0 answers

What steps do I need to take to make Kafka and Flink work on EMR without getting 'The AdminClient thread has exited. Call: describeTopics' error?

I am trying to use a Kafka source with the PyFlink Table API as follows: logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s") env =…
0
votes
0 answers

'CoderAdapterIml' object has no attribute 'encode' error with a custom serializer in PyFlink

I am trying to create a custom window, and right now I am facing problems with serialization. This is my current serializer: class CustomSerializer(TypeSerializer): def serialize(self, element, stream) -> None: bytes_data =…
0
votes
0 answers

PyFlink window aggregation not triggering

My window aggregation accumulates all results but does not return them, so my result stream is empty. I suspect it has something to do with window triggering, but cannot figure out…
Qwetroman
0
votes
0 answers

How do we process a DataStream from Kafka in Flink using the Table API and DataStream API together in PyFlink?

Can we use the Apache Flink Table API together with a DataStream from Kafka to build a processing pipeline in PyFlink? We are using Flink 1.16.0.
Gulfraz
0
votes
1 answer

Does my approach to dynamic windowing with Apache Flink go against Flink's architecture?

I’ve been trying to emulate the behavior of a dynamic window, as Flink does not support dynamic window sizes. My operator inherits from KeyedProcessFunction, and I’m only using KeyedStates to manipulate the window_size. I’m clearing the KeyedStates…
0
votes
0 answers

Records not showing in Flink when watermarking is activated

I am trying to run a Flink application locally with a local Kinesis stream. The following code works perfectly (as in, records can be seen in the sink table path), but when I change the watermark from event_ts to event_ts - INTERVAL '10' SECOND, the…
0
votes
0 answers

Apache Flink: PyFlink job erroring when trying to consume from Kafka using the new Flink KafkaSource API

I'm trying to consume from a Kafka topic using the Flink DataStream Kafka connector, as described in the official documentation. I'm using PyFlink for Python, and running a very simple example which looks like this: from pyflink.common import…