Questions tagged [pyflink]

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. PyFlink makes it available to Python.

PyFlink makes all of Apache Flink available to Python, and at the same time Flink benefits from Python's rich scientific computing functions.

What is PyFlink on apache.org

258 questions
0 votes, 0 answers

pyflink on yarn, kafka, NoClassDefFoundError

I deployed my PyFlink job on YARN; it includes a Kafka consumer. My run command: /opt/flink/bin/flink run -m yarn-cluster -yid application_1634021687380_0009 --jarfile=/opt/flink/lib/flink-sql-connector-kafka_2.11-1.13.2.jar -pyarch venv.zip -pyexec…
Ronnie
0 votes, 1 answer

Flink: java.io.IOException: Insufficient number of network buffers

I am trying to use Flink for data enrichment on multiple streams of data. Here I have some data in account_stream and status_stream. I want to add that data to all other streams coming from multiple different sources. All the streams have one field…
0 votes, 1 answer

When to use a temporary table or a permanent table in Flink

New to Flink, I am building a simple aggregation pipeline, e.g. sales amount each day. I am using the Table API. I see that there are two options for creating a table: temporary and permanent. For a permanent table, we also need to set up a catalog, e.g. Hive.…
Alfred
0 votes, 1 answer

Is conversion between DataStream and Table supported for the Python API

Is conversion between DataStream and Table supported for the Python API in the latest stable version, 1.13.2?
saktheesh
0 votes, 2 answers

Querying nested fields in Flink SQL 1.11

I have a schema that looks like this: Table: org_table `transaction_amt` VARCHAR(64) NOT NULL, `transaction_adj_amt` BIGINT NOT NULL , `event_time` TIMESTAMP(3), `fd_output` ROW<`restime` BIGINT `outcome` VARCHAR(64)>, When I query this table like…
0 votes, 2 answers

Can't run basic PyFlink example

I have this toy pipeline from pyflink.datastream import StreamExecutionEnvironment def pipeline(): # Create environment env = StreamExecutionEnvironment.get_execution_environment() env.set_parallelism(1) ds =…
0 votes, 2 answers

Flink Python DataStream API Kafka Producer Sink Serialization

I'm trying to read data from one Kafka topic and write it to another after some processing. I'm able to read the data and process it, but when I try to write it to another topic, it gives an error. If I try to write the data as it is without doing any…
0 votes, 1 answer

Loading Zeek connection data into PyFlink

Trying to load data like this (Zeek connection data) into PyFlink. My problem is the id fields, which have names containing a dot because they were originally a tuple in Zeek. { "ts": 1584544201.798601, "uid": "CSgDnESdxqqAN88H3", "id.orig_h":…
Ben
0 votes, 0 answers

How to follow an updating local file while using Flink

As mentioned in the documentation: For example a data pipeline might monitor a file system directory for new files and write their data into an event log. Another application might materialize an event stream to a database or incrementally build and…
Jason Pan
0 votes, 1 answer

Is there a way that I can use Azure SQL Server as a data source?

My project is currently built on Azure (the data is stored in Azure SQL Server). I am trying to add stream and batch processing to my project by leveraging PyFlink. However, I didn't find any documentation about how to connect PyFlink…
0 votes, 1 answer

PyFlink configuration error: SQL parse failed

I want to make sure that my Flink setup works before trying more complex usage. In the simplest example I tried the following. The input contains column_a,column_b 1,2 The output exists. In order to download the pyflink version of…
Andrea Lombardo
0 votes, 1 answer

pyflink winerror 2 get_gateway()

I am following a script from a linked Stack Overflow post, changing only the data, because I would like to try a simple use of PyFlink to check whether the installation works. You can see below the system and…
Andrea Lombardo
0 votes, 1 answer

I'm looking for code to connect PyFlink to Pulsar

I've been trying various examples, but I need to see how to connect PyFlink to Pulsar. I have Pulsar 2.8.0, Flink 1.13.1 and Scala 2.11. I just need to see how to set the parameters for PyFlink to connect to a topic on Pulsar. Please help.
0 votes, 1 answer

Using Python Processors in Java Flink Application

I have a use case where I want to implement an AWS Kinesis Data Application with Flink in Java. It will listen to multiple Kinesis streams via the Data Streams API. However, the analysis of those streams will be done in Python (since our data…
Alex
0 votes, 1 answer

PyFlink get_gateway() method is not working

I am using PyCharm to run a simple example with the latest PyFlink, 1.13 (it is only two lines). But the get_gateway() method is not working. For the interpreter, I created a virtual environment and use it in my PyCharm project. I also downloaded Java 8. I tried…
dodo