Questions tagged [flink-sql]

Apache Flink features two relational APIs:

  1. SQL (via Apache Calcite)
  2. Table API, a language-integrated query (LINQ) interface

Both APIs are unified APIs for stream and batch processing. This means that a query returns the same result regardless of whether it is applied to a static data set or a data stream. SQL queries are parsed and optimized by Apache Calcite; Table API queries are optimized by Calcite as well.

Both APIs are tightly integrated with Flink's DataStream and DataSet APIs.
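The unified semantics can be seen in a minimal query that reads identically over a bounded table or an unbounded stream (a sketch; the `Clicks` table and its columns are hypothetical):

```sql
-- Counts clicks per user; on a batch table this yields one final row per
-- user, on a stream it yields the same result as a continuously updated table.
SELECT user_id, COUNT(*) AS cnt
FROM Clicks
GROUP BY user_id
```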

667 questions
0
votes
1 answer

Having an equivalent to HOP_START inside an aggregation primitive in Flink

I'm trying to do an exponentially decaying moving average over a hopping window in Flink SQL. I need to have access to one of the borders of the window, the HOP_START in the following: SELECT …
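For context, `HOP_START` is a group auxiliary function that may only appear in the SELECT list of a query whose GROUP BY uses the matching `HOP` window with the same arguments; a typical shape looks like this (a sketch, with hypothetical table and column names):

```sql
SELECT
  -- window start, usable alongside aggregates but not inside them
  HOP_START(rowtime, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE) AS w_start,
  AVG(price) AS avg_price
FROM Orders
GROUP BY HOP(rowtime, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE)
```

The question is precisely about the gap this illustrates: the window border is available next to an aggregate, but not inside the aggregate expression itself.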
0
votes
1 answer

An exponentially decaying moving average over a hopping window in Flink SQL: Casting time

Now that we have SQL with fancy windowing in Flink, I'm trying to implement the decaying moving average referred to by "what will be possible in future Flink releases for both the Table API and SQL." from their SQL roadmap/preview 2017-03 post: table …
0
votes
1 answer

Compiler error on Registering a TemporalTableFunction as a Function

I'm following Flink's Defining Temporal Table Function example, and the compiler refuses to accept that code: TemporalTableFunction rates = ratesHistory.createTemporalTableFunction("r_proctime", "r_currency"); tEnv.registerFunction("Rates",…
BenoitParis
  • 3,166
  • 4
  • 29
  • 56
0
votes
1 answer

Select Top N Per Group using Flink Sql

I am using Flink SQL to handle a batch case. How can I get the top-n records per group using Flink SQL?
Eric Zhang
  • 25
  • 4
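In later Flink versions the documented Top-N pattern uses `ROW_NUMBER` over a partition; roughly (a sketch; `ShopSales` and its columns are hypothetical):

```sql
SELECT category, item, sales
FROM (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS row_num
  FROM ShopSales)
WHERE row_num <= 3  -- keep the top 3 rows per category
```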
0
votes
1 answer

Filtering Flink Table based on field of type Date

I created a table which has one field of type Date, namely f_date. One part of my desired query filters table rows based on the f_date field. So I did the following: mytable.filter("f_date <= '1998-10-02'") and mytable.filter("f_date <=…
Soheil Pourbafrani
  • 3,249
  • 3
  • 32
  • 69
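A SQL-side equivalent of such a filter compares the column against a DATE literal (a sketch; the table and column names are taken from the question):

```sql
-- Date comparison via an explicit DATE literal rather than a string
SELECT *
FROM mytable
WHERE f_date <= DATE '1998-10-02'
```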
0
votes
1 answer

Flink how to create table with the schema inferred from Avro input data

I have loaded an Avro file in a Flink DataSet: AvroInputFormat&lt;GenericRecord&gt; test = new AvroInputFormat&lt;&gt;(new Path("PathToAvroFile"), GenericRecord.class); DataSet DS =…
Soheil Pourbafrani
  • 3,249
  • 3
  • 32
  • 69
0
votes
2 answers

Nested output of Flink

I am processing a Kafka stream using Flink SQL, where every message is pulled from Kafka, processed using Flink SQL, and pushed back into Kafka. I want a nested output, where the input is flat and the output is nested. Say for example my input is…
mjennet
  • 75
  • 1
  • 10
0
votes
1 answer

Flink sql for state checkpoint

When I use the Flink SQL API to process data and restart the app, the sum result is not saved in the checkpoint; it still starts with 1. final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); StateBackend stateBackend = new…
0
votes
2 answers

How to write data from flink pipeline to redis efficiently

I am building a pipeline in the Apache Flink SQL API. The pipeline does a simple projection query. However, I need to write the tuples (precisely, some elements in each tuple) once before the query and another time after the query. It turned out that…
0
votes
1 answer

Running Flink in Yarn

I'm running Flink (1.4.2) on YARN. I'm using the Flink YARN client for submitting the job to the YARN cluster. Suppose I have a TM with 4 slots and I deploy a Flink job with parallelism=4 with 2 containers - 1 JM and 1 TM. Each parallel instance will be…
user3107673
  • 423
  • 4
  • 9
0
votes
1 answer

Watermarks in a RichParallelSourceFunction

I am implementing a SourceFunction, which reads data from a database. The job should be able to be resumed if stopped or crashed (i.e. savepoints and checkpoints) with the data being processed exactly once. What I have so…
ScalaNewbie
  • 173
  • 3
  • 12
0
votes
1 answer

Flink SQL : run out of memory for joining tables

I have a frequently updated MySQL table. I want to take a snapshot for each id that was updated in the past 20 seconds and write the value into Redis. I use the binlog as streaming input and transform the datastream into a Flink table. I run the…
lgbo
  • 221
  • 3
  • 14
0
votes
3 answers

Flink: Could not find a suitable table factory for 'org.apache.flink.table.factories.DeserializationSchemaFactory' in the classpath

I am using Flink's Table API. I receive data from Kafka, then register it as a table, then use a SQL statement to process it, and finally convert the result back to a stream and write it to a directory. The code looks like this: def main(args:…
Clay4megtr
  • 21
  • 1
  • 1
  • 4
0
votes
1 answer

Flink distributes tasks to one TaskManager until its slots are full

I have a Flink cluster with 5 nodes, and each node has 8 slots. I am using Flink 1.5.2. If there are N tasks, the problem is: if N <= 8, all tasks will be assigned to node1; new tasks will be assigned to node2 until it is full, etc. And the other…
Longxing Wei
  • 171
  • 2
  • 17
0
votes
1 answer

Flink SQL: Repeating grouping keys in result of GROUP BY query

I want to do a simple query in Flink SQL on one table which includes a group by statement. But in the results there are duplicate rows for the column specified in the group by statement. Is that because I use a streaming environment and it doesn't…
Gatsby
  • 365
  • 1
  • 5
  • 17
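The repeated keys come from the streaming semantics of a non-windowed GROUP BY: the result is an updating table, so the same key is re-emitted whenever its aggregate changes. Grouping by a time window instead yields one final row per key and window (a sketch, with hypothetical table and column names):

```sql
-- One result row per user and hour, emitted when the window closes
SELECT
  user_id,
  TUMBLE_END(rowtime, INTERVAL '1' HOUR) AS w_end,
  COUNT(*) AS cnt
FROM Clicks
GROUP BY user_id, TUMBLE(rowtime, INTERVAL '1' HOUR)
```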