Questions tagged [flink-sql]

Apache Flink features two relational APIs, SQL and Table API, as unified APIs for stream and batch processing.

Apache Flink features two relational APIs:

  1. SQL (via Apache Calcite)
  2. Table API, a language-integrated query (LINQ) interface

Both APIs are unified APIs for stream and batch processing. This means that a query returns the same result regardless of whether it is applied to a static data set or a data stream. Queries from both APIs are parsed and optimized by Apache Calcite.

Both APIs are tightly integrated with Flink's DataStream and DataSet APIs.
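As an illustration of this unification, a simple aggregation query like the following (table and column names are hypothetical) behaves the same way on bounded and unbounded input:

```sql
-- The same statement runs in both batch and streaming mode.
-- In batch mode the count is computed once over the finite input;
-- in streaming mode the result is continuously updated as rows arrive.
SELECT user_id, COUNT(*) AS order_cnt
FROM Orders
GROUP BY user_id;
```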

667 questions
2
votes
1 answer

PyFlink Vectorized UDF throws NullPointerException

I have an ML model that takes two numpy.ndarrays (users and items) and returns a numpy.ndarray of predictions. In normal Python code, I would do: model = load_model() df = load_data() # the DataFrame includes 4 columns, namely, user_id, movie_id,…
yiksanchan
  • 1,890
  • 1
  • 13
  • 37
2
votes
1 answer

Compare Flink Table API to join table, and DataStream.join()

I am trying to join two DataStreams by ID, and found there are two API sets that can do…
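As a sketch, the SQL side of that comparison might look like this (table and column names are hypothetical):

```sql
-- A regular join in Flink SQL; both inputs may be unbounded streams.
-- Note: for unbounded inputs a regular join keeps state for both sides
-- indefinitely; a time-bounded (interval) join condition limits that state.
SELECT l.id, l.payload, r.details
FROM left_stream AS l
JOIN right_stream AS r
  ON l.id = r.id;
```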
2
votes
1 answer

Flink Jdbc sink

I have created an application where I read data from Kinesis streams and sink the data into a MySQL table. I tried to load test the app. For 100k entries it takes more than 3 hours. Any suggestions as to why it's so slow? One more thing is the…
user7665394
  • 39
  • 1
  • 5
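One common cause of a slow JDBC sink is flushing rows one at a time. Flink's JDBC connector exposes buffering options in the table definition; a sketch with placeholder names and values:

```sql
CREATE TABLE mysql_sink (
  user_id BIGINT,
  movie_id BIGINT,
  rating  DOUBLE
) WITH (
  'connector'  = 'jdbc',
  'url'        = 'jdbc:mysql://localhost:3306/mydb',
  'table-name' = 'ratings',
  -- batch writes instead of per-row inserts
  'sink.buffer-flush.max-rows' = '1000',
  'sink.buffer-flush.interval' = '1s'
);
```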
2
votes
1 answer

Flink SQL Client connect to secured kafka cluster

I want to execute a query on a Flink SQL table backed by a Kafka topic of a secured Kafka cluster. I'm able to execute the query programmatically but unable to do the same through the Flink SQL client. I'm not sure how to pass the JAAS config…
Markiv
  • 317
  • 1
  • 5
  • 13
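In the SQL client, Kafka client security settings can typically be passed through connector options prefixed with `properties.`, including the JAAS config inline. A sketch (broker address, credentials, and topic are placeholders):

```sql
CREATE TABLE secured_topic (
  id      STRING,
  payload STRING
) WITH (
  'connector' = 'kafka',
  'topic'     = 'my-topic',
  'properties.bootstrap.servers' = 'broker:9093',
  'properties.security.protocol' = 'SASL_SSL',
  'properties.sasl.mechanism'    = 'PLAIN',
  -- JAAS config passed directly as a property instead of via a jaas file
  'properties.sasl.jaas.config'  = 'org.apache.kafka.common.security.plain.PlainLoginModule required username="user" password="pass";',
  'format' = 'json'
);
```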
2
votes
0 answers

How to assign a unique ID to each row in a table in the Flink Table API?

I'm using Flink to compute a series of operations. Each operation produces a table which is both used for the next operation as well as stored in S3. This makes it possible to view the data at each intermediate step in the calculation and see the…
Alex Hall
  • 34,833
  • 5
  • 57
  • 89
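One possibility, assuming a reasonably recent Flink version, is the built-in UUID() scalar function (table name is hypothetical):

```sql
-- UUID() is non-deterministic: it returns a fresh RFC 4122 UUID string
-- each time it is evaluated, giving every output row a unique ID.
SELECT UUID() AS row_id, t.*
FROM my_table AS t;
```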
2
votes
1 answer

Failed to deserialize Avro record - Apache flink SQL CLI

I'm publishing avro serialized data to a Kafka topic and then trying to create a Flink table from the topic via the SQL CLI interface. I'm able to create the topic but not able to view the topic data after executing a SQL SELECT statement. However, I'm able…
Markiv
  • 317
  • 1
  • 5
  • 13
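A frequent cause of this error: records produced with the Confluent Schema Registry wire format cannot be read by the plain `avro` format, and Flink ships a separate `avro-confluent` format for that case. A sketch (the registry option name varies between Flink versions; URL and schema are placeholders):

```sql
CREATE TABLE avro_topic (
  user_id BIGINT,
  name    STRING
) WITH (
  'connector' = 'kafka',
  'topic'     = 'users',
  'properties.bootstrap.servers' = 'broker:9092',
  -- use 'avro-confluent' (not 'avro') for registry-encoded records
  'format' = 'avro-confluent',
  'avro-confluent.url' = 'http://schema-registry:8081'
);
```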
2
votes
2 answers

Select all fields as json string as new field in Flink SQL

I am using the Flink Table API. I have a table definition, and I want to select all fields and convert them to a JSON string in a new field. My table has three fields: a: String, b: Int, c: Timestamp. If I do INSERT INTO kinesis SELECT a, b, c from…
lalala
  • 63
  • 1
  • 6
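Assuming a Flink version recent enough to include the SQL/JSON construction functions (1.15+), one sketch of producing a single JSON-string field from the three columns:

```sql
-- Build one JSON string column from the individual fields.
-- The timestamp is cast to STRING since JSON has no native timestamp type.
INSERT INTO kinesis
SELECT JSON_OBJECT(
         KEY 'a' VALUE a,
         KEY 'b' VALUE b,
         KEY 'c' VALUE CAST(c AS STRING)
       ) AS payload
FROM source_table;
```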
2
votes
1 answer

Write the result of SQL Query to file by Apache Flink

I have the following task: create a job with a SQL query against a Hive table; run this job on a remote Flink cluster; collect the result of this job in a file (HDFS is preferable). Note: because it is necessary to run this job on a remote Flink cluster, I can…
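One approach is to declare a filesystem sink table and write the query result into it with INSERT INTO; a sketch with placeholder names and paths:

```sql
-- Filesystem sink; the path may point at HDFS
CREATE TABLE hdfs_sink (
  col1 STRING,
  col2 BIGINT
) WITH (
  'connector' = 'filesystem',
  'path'      = 'hdfs:///output/results',
  'format'    = 'csv'
);

-- Run the SQL query against the Hive table and collect the result in files
INSERT INTO hdfs_sink
SELECT col1, col2 FROM hive_source_table;
```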
2
votes
1 answer

How to handle exceptions in Apache flink KeyedBroadCastProcessFunction

I am new to Flink. I am doing pattern evaluation using Flink's KeyedBroadcastProcessFunction, something along the lines of https://flink.apache.org/2019/06/26/broadcast-state.html. I am using Java to develop my code, but I am not getting how…
user13906258
  • 161
  • 1
  • 13
2
votes
1 answer

PyFlink - specify Table format and process nested JSON string data

I have a JSON data object as such: { "monitorId": 865, "deviceId": "94:54:93:49:96:13", "data": "{\"0001\":105.0,\"0002\":1.21,\"0003\":0.69,\"0004\":1.46,\"0005\":47.43,\"0006\":103.3}", "state": 2, "time": 1593687809180 } The…
343GuiltySpark
  • 123
  • 1
  • 2
  • 11
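If the nested `data` field arrives as a JSON string inside the outer record, recent Flink versions can extract individual values with JSON_VALUE. A sketch, assuming a source table `monitor_events` where `data` is a STRING column and assuming the quoted-member SQL/JSON path syntax is accepted for the numeric keys:

```sql
SELECT
  monitorId,
  deviceId,
  -- JSON_VALUE returns STRING; cast to the numeric type you need
  CAST(JSON_VALUE(data, '$."0001"') AS DOUBLE) AS sensor_0001,
  CAST(JSON_VALUE(data, '$."0002"') AS DOUBLE) AS sensor_0002
FROM monitor_events;
```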
2
votes
1 answer

NullPointer Exception while trying to access or read ReadOnly ctx in processElement method in KeyedBroadCastProcessFunction in Apache Flink

I have an interesting scenario: I am working on pattern matching in Flink, evaluating incoming patterns using KeyedBroadcastProcessFunction. When I run the program in the IDE, I get a NullPointerException in the processElement function…
YRK
  • 153
  • 1
  • 1
  • 22
2
votes
1 answer

Flink Process Function is not returning the data to Sideoutputstream

I am trying to validate a JSONObject against a set of rules: if the JSON matches one of the rules, the function returns the matched rule and the JSONObject; if not, it returns the JSONObject to a side output. All of this is processed in a ProcessFunction. I am getting…
YRK
  • 153
  • 1
  • 1
  • 22
2
votes
1 answer

Flink : DataStream to Table

Use case: read protobuf messages from Kafka, deserialize them, apply some transformations (flatten out some columns), and write to DynamoDB. Unfortunately, the Kafka Flink Connector only supports csv, json, and avro formats. So, I had to use a lower-level…
Nitin Pandey
  • 649
  • 1
  • 9
  • 27
2
votes
2 answers

Why not able to add proctime by call addColumns?

tableEnv.fromDataStream(xxxStream).addColumns('processTime.proctime) The above code will throw an exception: org.apache.flink.table.api.ValidationException: Window properties can only be used on windowed tables. but this will…
xiemeilong
  • 643
  • 1
  • 6
  • 21
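The underlying rule is that a processing-time attribute must be declared when the table's schema is defined, not appended afterwards with addColumns. In SQL DDL this is done with a computed column; a sketch (the datagen connector stands in for a real source):

```sql
CREATE TABLE events (
  id      STRING,
  payload STRING,
  -- processing-time attribute declared as part of the schema
  proc_time AS PROCTIME()
) WITH (
  'connector' = 'datagen'
);
```

In the Table API the equivalent is passing the proctime expression at table creation, e.g. fromDataStream(xxxStream, 'f0, 'processTime.proctime), rather than calling addColumns on an existing table.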
2
votes
1 answer

How to define an apache flink table with row time attribute

I have JSON rows coming in as my data, and I want to create a table out of them. StreamTableEnvironment fsTableEnv = StreamTableEnvironment.create(streamExecutionEnvironment, fsSettings); String allEventsTable = "allEventsTable"; …
sky
  • 260
  • 3
  • 12
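In SQL DDL, an event-time (rowtime) attribute is declared with a WATERMARK clause on a TIMESTAMP(3) column; a sketch with placeholder names:

```sql
CREATE TABLE allEventsTable (
  event_id   STRING,
  event_time TIMESTAMP(3),
  -- declares event_time as a rowtime attribute,
  -- tolerating 5 seconds of out-of-orderness
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic'     = 'events',
  'properties.bootstrap.servers' = 'broker:9092',
  'format'    = 'json'
);
```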