Questions tagged [google-bigquery]

Google BigQuery is a Google Cloud Platform product providing serverless queries of petabyte-scale data sets using SQL. BigQuery provides multiple read-write pipelines, and enables data analytics that transform how businesses analyze data.

Google BigQuery is a web service that lets you do interactive analysis of massive datasets—up to billions of rows. Scalable and easy to use, BigQuery lets developers and businesses tap into powerful data analytics on demand.

25130 questions
4
votes
1 answer

Google Pub/Sub to Dataflow, avoid duplicates with Record ID

I'm trying to build a streaming Dataflow job that reads events from Pub/Sub and writes them into BigQuery. According to the documentation, Dataflow can detect duplicate message delivery if a Record ID is used (see:…
4
votes
0 answers

Downloading Bigish Datasets with bigrquery - Best practices?

I'm trying to download a table of about 250k rows and 500 columns from BigQuery into R for some model building in h2o using the R wrappers. It's about 1.1 GB when downloaded from BQ. However, it runs for a long time and then loses the connection so…
andrewm4894
4
votes
1 answer

com.google.cloud.bigquery.BigQueryException: Read timed out

I am querying data from BigQuery. Here is my code: import com.google.cloud.bigquery.*; public static JSONArray query(String tableId, String field, String val) throws Exception{ String queryString = "SELECT * FROM `" + tableId +"` where " +…
dina
4
votes
3 answers

Fetching data from large BigQuery table in python

I have a BigQuery table (>5 million rows). I need to fetch this data in batches and process it inside App Engine, in Python. The only way I know to fetch from a table is to run a SELECT query on it and then iterate over the result using tokens…
Andrei Ivasiuc
4
votes
1 answer

Passing ARRAY of STRUCTs into user-defined function for standard BigQuery SQL

How would I go about passing an ARRAY of STRUCTs into my user-defined function (using standard SQL)? Firstly, a bit of context: Table schema: id STRING customer STRING request STRUCT< headers STRING body STRING url STRING > response STRUCT< …
garbetjie
4
votes
1 answer

Google BigQuery DML - Error while trying to run basic UPDATE

I am trying to run the following basic UPDATE statement UPDATE [gcp-or:babynames.names_2014] SET name = "Emma B" WHERE name = "Emma" However, I am getting the following error: "Query Failed Error: 1.1 - 1.76: Unrecognized token UPDATE" Error Any…
4
votes
1 answer

BigQuery Illegal Escape Sequence for backslash /

I am using a UDF to do a little regex matching on phrases like 'test/test', and I came across an error that I can't correct. CREATE TEMPORARY FUNCTION parseMethod(queryString STRING) RETURNS STRING LANGUAGE js AS """ var match_regex = /test\/(\w+)/i; var…
dorachan2010
4
votes
1 answer

Extracting Firebase / BigQuery DAUs, WAUs and MAUs

I don’t want to overcomplicate this question, so I will try to ask it as clearly as possible to avoid confusion. The outcome I require is two-fold. I want to determine the DAUs, WAUs, and MAUs for a mobile app within: a) Google Analytics, as well…
d_-
4
votes
1 answer

Big Query : Cast float64 to date in SQL Standard

I would like to cast a float64 to a date/datetime. I have a field with content 1.483653436E9; how could I convert it to a date? Thanks
gabriel.almeida
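To make the conversion in the question above concrete, here is a minimal Python sketch. It assumes the float holds Unix epoch seconds, which the question does not state but the magnitude of 1.483653436E9 suggests. The function name `epoch_seconds_to_datetime` is hypothetical and only for illustration.

```python
from datetime import datetime, timezone


def epoch_seconds_to_datetime(value: float) -> datetime:
    """Interpret a float as Unix epoch seconds (an assumption) and return a UTC datetime."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


# The sample value quoted in the question.
converted = epoch_seconds_to_datetime(1.483653436e9)

print(converted.isoformat())  # full timestamp in UTC
print(converted.date())       # calendar date only
```

Under the same assumption, a similar result could likely be obtained inside BigQuery standard SQL with something like `TIMESTAMP_SECONDS(CAST(field AS INT64))`, optionally wrapped in `DATE(...)`; `field` is a placeholder column name.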
4
votes
2 answers

How to measure language popularity via Github Archive data?

I'm attempting to measure programming language popularity via: The number of stars on repos in combination with... The programming languages used in the repo and... The total bytes of code in each language (recognizing that some languages are…
Abe
4
votes
1 answer

Is there a way of deleting old partitions in a partitioned table using bigquery API?

I have a daily partitioned table, and I want to delete older partitions through the API. The documentation only says that older partitions that have not been updated for 3 months are stored at a 50% discount. Thanks Google, but I really do not intend to keep…
J Doe
4
votes
4 answers

Error importing Google Cloud Bigquery api module in python app

I am trying to import BigQuery into my Python app with from google.cloud import bigquery and run it locally with dev_appserver.py, but I receive an error: File…
Andrei Ivasiuc
4
votes
1 answer

Self join in big query is running very slowly, am I following best practices?

I'm creating a table of the number of overlapping commentators between Reddit subreddits via the following self join: SELECT t1.subreddit, t2.subreddit, COUNT(*) as NumOverlaps FROM [fh-bigquery:reddit_comments.2015_05] t1 JOIN…
Trevor M.
4
votes
1 answer

Proguard configuration for Big query

I have implemented BigQuery in my project using the following Gradle file: compile ('com.google.apis:google-api-services-bigquery:v2-rev328-1.22.0'){ exclude module: 'httpclient' //by artifact name exclude group: 'org.apache.httpcomponents'…
4
votes
2 answers

Any JDBC Driver for Google BigQuery Standard SQL

I need a JDBC driver to connect my application to Google BigQuery. I tried CData JDBC driver, but it did not support all types of Standard SQL queries. Are there any other complete options?
Hajar Homayouni