Questions tagged [beam-sql]

BeamSQL is built on top of the Apache Beam Java SDK as a relational API for unified batch and streaming data processing.

BeamSQL features

  1. Connect heterogeneous storage systems - access data from different systems with ease.
  2. Pure SQL pipelines in the SQL shell - lower the barrier to writing data processing pipelines.
  3. Embedded SQL in pipelines - more flexibility and productivity.
  4. Unified batch and streaming semantics - towards one SQL for batch, streaming, and mixed use cases.

Resources

45 questions
0
votes
1 answer

How do I use Apache Beam to trigger an aggregation based on a new incoming event?

Problem: I'm building a mobile game app with real-time scoring. Each time a player performs an action, it sends a message to Pub/Sub with the following keys: {"event_ts","event_id", "player_id", "score"} As soon as the Pub/Sub message is received, I…
Sid
  • 1
  • 1
0
votes
2 answers

Exception while writing multipart empty csv file from Apache Beam into netApp Storage Grid

Problem Statement: We are consuming multiple CSV files into PCollections -> applying Beam SQL to transform the data -> writing the resulting PCollection. This works absolutely fine if we have data in all the source PCollections and the Beam SQL generates new…
0
votes
1 answer

Dataflow / Beam Accumulator coder

I am developing a Dataflow pipeline that uses the SqlTransform library and also the Beam aggregation function defined in org.apache.beam.sdk.extensions.sql.impl.transform.agg.CountIf. Here is a slice of the code: import…
0
votes
1 answer

How to output nested Row from Beam SQL (SqlTransform)?

I want to have Row with nested Row from output of Beam SQL (SqlTransform), but failing. Questions: What is the proper way to output Row with nested Row from SqlTransform? (Row type is described in the docs, so I believe it's supported) If this is a…
0
votes
2 answers

TypeError: expected bytes, str found [while running 'Writing to DB/ParDo(_WriteToRelationalDBFn) while writing to db from using beam-nuggets

@mohaseeb I am trying the example below to write data from Pub/Sub to PostgreSQL. I am getting the error below while writing Pub/Sub data into PostgreSQL. "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/result.py", line 545, in…
0
votes
1 answer

How to cast int to boolean when doing SQL transform in Apache Beam

I'm trying to do a SQL transform with Apache Beam using Calcite SQL syntax, casting an int to a boolean. My SQL looks like this: ,CASE WHEN cast(IsService as BOOLEAN) THEN CASE WHEN IsEligible THEN 1 ELSE 0 END ELSE NULL END AS Reported Where…
artofdoe
  • 167
  • 2
  • 14
0
votes
2 answers

How to specify BeamSQL UDF for Numeric Types

I'm trying to add a User Defined Function (UDF) to a SqlTransform in a Beam pipeline, and the SQL parser doesn't seem to understand the function's type. The error I get is: No match found for function signature IF(, ,…
Mark P Neyer
  • 1,009
  • 2
  • 8
  • 19
0
votes
1 answer

How to integrate Beam SQL windowing query with KafkaIO?

First, we have a Kafka input source in JSON format: {"event_time": "2020-08-23 18:36:10", "word": "apple", "cnt": 1} {"event_time": "2020-08-23 18:36:20", "word": "banana", "cnt": 1} {"event_time": "2020-08-23 18:37:30", "word": "apple", "cnt":…
0
votes
0 answers

Beam SQL CURRENT_TIMESTAMP

My Unix Spark server timezone is CDT, but when I run Beam SQL CURRENT_TIMESTAMP as below, the result always comes back as UTC. I tried locally as well, and it always displays UTC. I want CURRENT_TIMESTAMP to return CDT, the same as the server's timezone.…
Syed Mohammed Mehdi
  • 183
  • 2
  • 5
  • 15
0
votes
2 answers

How to select a set of fields from input data as an array of repeated fields in beam SQL

Problem Statement: I have an input PCollection with following fields: { firstname_1, lastname_1, dob, firstname_2, lastname_2, firstname_3, lastname_3, } then I execute a Beam SQL operation such that output of…
0
votes
1 answer

row_number in Apache Beam SQL

I'm trying to generate row_number using Apache Beam SQL with below code: PCollection rwrtg = PCollectionTuple.of(new TupleTag<>("trrtg"), rrtg) .apply(SqlTransform.query("select appId, row_number() over…
Syed Mohammed Mehdi
  • 183
  • 2
  • 5
  • 15
0
votes
1 answer

How can I increase the thread stack size on Apache Beam pipeline workers with Google Cloud Dataflow?

I'm getting a StackOverflowError on my Beam workers due to running out of thread stack, and because it happens deep within the execution of a SqlTransform, it's not straightforward to reduce the number of calls being made. Is it possible to change the JVM…
wrp
  • 99
  • 6
0
votes
2 answers

Errors trying to start ZetaSQL planner

I'm trying to run a Beam pipeline with SQL transforms, parsed with ZetaSQL. I begin with setting options with options.setPlannerName("org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner"); When I try creating my SqlTransform with any…
wrp
  • 99
  • 6
0
votes
2 answers

Apache beam get kafka data execute SQL error:Cannot call getSchema when there is no schema

I input data from multiple tables into Kafka, and Beam executes SQL after receiving the data, but now I get the following error: Exception in thread "main" java.lang.IllegalStateException: Cannot call getSchema when there is no schema …
smarctor
  • 3
  • 1
0
votes
1 answer

ZetaSQL Sample Using Apache beam

I am facing issues while using ZetaSQL in the Apache Beam framework (2.17.0-SNAPSHOT). After going through the Apache Beam documentation, I am not able to find any sample for ZetaSQL. I tried to add the planner: …
BackBenChers
  • 304
  • 2
  • 15