Questions tagged [flink-sql]

Apache Flink features two relational APIs, SQL and Table API, as unified APIs for stream and batch processing.

Apache Flink features two relational APIs:

  1. SQL (via Apache Calcite)
  2. Table API, a language-integrated query (LINQ) interface

Both APIs are unified APIs for stream and batch processing. This means that a query returns the same result regardless of whether it is applied to a static data set or a data stream. SQL queries are parsed and optimized by Apache Calcite, which also optimizes Table API queries.

Both APIs are tightly integrated with Flink's DataStream and DataSet APIs.
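To illustrate the unified model (a minimal sketch with hypothetical table and column names), the same SQL statement can run over a bounded file or an unbounded Kafka topic without changes:

```sql
-- `clicks` is a hypothetical table; it could be backed by a CSV file
-- (batch) or a Kafka topic (streaming); the query text is identical.
SELECT user_id, COUNT(*) AS cnt
FROM clicks
GROUP BY user_id;
```

Over a bounded source this produces a final result; over a stream it produces a continuously updating result.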

667 questions
2
votes
1 answer

Apache Flink: How to remove duplicates within select query?

How to remove duplicates within a SELECT query using Apache Flink? My table is: and I want to remove duplicates in ID with respect to keeping the maximum in a range
Waqar Babar
  • 109
  • 1
  • 7
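For questions like the one above, the deduplication pattern commonly used in Flink SQL keeps one row per key with ROW_NUMBER(); a sketch with hypothetical table and column names:

```sql
-- Keep, for each id, the row with the maximum value of `val`
-- (table and column names are placeholders).
SELECT id, val
FROM (
  SELECT id, val,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY val DESC) AS rn
  FROM source_table
)
WHERE rn = 1;
```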
2
votes
1 answer

Can Flink State replace an external database

I have a Flink project that receives an event stream and executes some logic to add a flag to each event; it then saves the flag and the event ID for a while so they can be reused or queried by another system. In this case, the volume of data is not…
Leyla Lee
  • 466
  • 5
  • 19
2
votes
1 answer

Apache Flink: How to query a relational database with the Table API?

The following code snippet is taken from this blog post: val sensorTable = ??? // can be a CSV file, Kafka topic, database, or ... // register the table source tEnv.registerTableSource("sensors", sensorTable) I would like to read data from a…
Amit Dass
  • 41
  • 6
2
votes
1 answer

Apache Flink Table 1.4: External SQL execution on Table possible?

Is it possible to query an existing StreamTable externally, without uploading a .jar, to get the execution environment and retrieve the table environment? I had waited for the Apache Flink Table 1.4 release because of its dynamic (continuous) table…
lkaupp
  • 551
  • 1
  • 6
  • 17
2
votes
1 answer

Table API Scala

I am trying to join two tables using the Flink Scala Table API. I have one table with two columns (a, b) and another table with one column (c). I want to join the two tables into a bigger table with three columns (a, b, c). I simply want to join them; I don't…
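For a join like the one described above, Flink needs a join predicate; without a common column, only a cross join is possible. A Flink SQL sketch with hypothetical names, assuming a shared key column exists:

```sql
-- t1(a, b) and t2(b, c) are hypothetical; the join assumes
-- a common column b. Without any common column, use CROSS JOIN.
SELECT t1.a, t1.b, t2.c
FROM t1
JOIN t2 ON t1.b = t2.b;
```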
1
vote
1 answer

Flink TableAPI: PartitionedBy columns missing in Parquet files

I’m using the filesystem connector to sink data into S3 in Parquet format using the Table API. I observed that the partitionedBy columns are missing in the Parquet file. Here are the queries I’m using: CREATE TABLE data_to_sink ( record_id STRING NOT NULL, …
user3497321
  • 443
  • 2
  • 6
  • 15
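Regarding the question above: with the filesystem connector, partition columns are encoded in the directory path rather than written into the data files, so their absence from the Parquet files is expected. A sketch of such a DDL (connector options and paths are illustrative):

```sql
CREATE TABLE data_to_sink (
  record_id STRING NOT NULL,
  dt STRING
) PARTITIONED BY (dt) WITH (
  'connector' = 'filesystem',
  'path' = 's3://my-bucket/data',  -- illustrative path
  'format' = 'parquet'
);
-- Rows land under .../dt=<value>/...; the dt column itself is
-- carried by the directory name, not stored inside each Parquet file.
```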
1
vote
0 answers

Temporal join on Temporary views

Can a temporal join be achieved on temporary views? I’m using Flink SQL to create 2 tables using the kafka and upsert-kafka connectors respectively. I apply some transformations on these tables and then create temporary views. The final query joins…
user3497321
  • 443
  • 2
  • 6
  • 15
1
vote
3 answers

How to restart flink from a savepoint from within the code

I have a Java class that submits SQL files to a Flink cluster. I have StreamExecutionEnvironment streamExecutionEnvironment =…
user3822232
  • 141
  • 1
  • 8
1
vote
0 answers

Watermark Strategy after a simple transformation on the table

Sorry for the simple question, but I am struggling to understand how to find out whether the result of a given query has a watermark or not. For example, I define my source in the DataStream API and then convert it to the Table API leveraging SOURCE_WATERMARK()…
tootso
  • 11
  • 2
1
vote
0 answers

Flink SQL doesn't unpack gzipped source on the fly - but still parses PART of it

UPD: I actually found a JIRA ticket which describes my problem here: https://issues.apache.org/jira/browse/FLINK-30314. Waiting for its resolution… I've met a strange issue and I need to ask you guys if I'm not missing anything. I have an issue with…
1
vote
0 answers

Propagating time attribute after an interval join, to do window aggregation later

I am attempting to change query 4 (Average Price for a category) from the Nexmark benchmark to produce an append-only stream of the average price for a category, using only Flink SQL as much as possible. The query's original description is available here…
Vignesh Chandramohan
  • 1,306
  • 10
  • 15
1
vote
0 answers

Imported Protobuf Message Type Support in Apache Flink

I am experiencing an issue when trying to use the "com.google.protobuf.Any" message type in my protobuf definitions along with the official "protobuf" deserializer provided by Flink in version 1.16. It seems that the following message…
dv_
  • 11
  • 1
1
vote
0 answers

Failed to deserialize Avro record : Getting ArrayIndexOutOfBoundsException

I am trying to read from Kafka with Avro format using PyFlink. My program is this: from pyflink.datastream import StreamExecutionEnvironment from pyflink.datastream.connectors.kafka import FlinkKafkaConsumer from pyflink.datastream.formats.avro…
mjennet
  • 75
  • 1
  • 10
1
vote
1 answer

How to use a field in ROW type column in Flink SQL?

I'm executing SQL in Flink that looks like this: create table team_config_source ( `payload` ROW( `before` ROW( team_config_id int, ... ), `after` ROW( team_config_id int, ... …
Rinze
  • 706
  • 1
  • 5
  • 21
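For nested ROW columns like the one in this question, Flink SQL addresses fields with dot notation; a sketch against the table from the excerpt above (field names assumed from the excerpt):

```sql
-- `after` is quoted with backticks because it can clash with a keyword.
SELECT `payload`.`after`.team_config_id
FROM team_config_source;
```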
1
vote
0 answers

Table in parallel environment

If I have a table in a parallelism = 2 environment, does it mean I have 2 separate tables? For example, I use a table for calculating the counts (and average values) of messages received from Kafka for a window period (select count(x), avg(x) from…
Kenank
  • 321
  • 1
  • 10