There are some simple instructions here on how to use tidyverse principles to wrangle data in BigQuery with the R packages bigrquery and dbplyr.

This works by taking regular dplyr code and, when the user calls %>% collect(), translating it into BigQuery's flavor of SQL and then executing that SQL in BigQuery.

I would like to know whether I can use these packages to generate the raw BigQuery code without executing it.

What I am ultimately after is a way to generate BigQuery code from dplyr without actually using BigQuery (e.g. when working offline).

What I know so far

I know it's possible to write dplyr code, call %>% collect(), and then view the BigQuery code that was generated and run in the GCP console in the browser. I would like the same code returned as a string in RStudio, without it ever being executed.

stevec

1 Answer

Instead of collect(), just add %>% show_query() at the end of your dplyr code.
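As a rough sketch of the difference between printing the SQL and getting it back as a string: show_query() prints the translated query, while dbplyr's sql_render() returns it as an object you can store. The table and column names below are hypothetical, and lazy_frame() is used only as a stand-in so the snippet runs without any database. It uses dbplyr's generic simulated backend, so the SQL may differ in detail from what a real bigrquery connection would produce.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in table: lazy_frame() creates a lazy table with no real
# database behind it (generic simulated backend, NOT a BigQuery connection).
orders <- lazy_frame(customer_id = 1L, amount = 10.5)

query <- orders %>%
  filter(amount > 5) %>%
  group_by(customer_id) %>%
  summarise(total = sum(amount, na.rm = TRUE))

# Prints the translated SQL to the console; the query is not executed.
show_query(query)

# Returns the translated SQL instead of printing it, so it can be stored.
sql_string <- as.character(sql_render(query))
```

With a real BigQuery-backed tbl in place of `orders`, the same two calls should apply, since neither of them collects any rows.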

mnist
  • That sounds great. Do you know if this can work offline? I think perhaps not because it requires some pointer to a BigQuery table? – stevec Nov 24 '19 at 12:22
  • I tried it on `iris` and get `no applicable method for 'show_query' applied to an object of class "data.frame"`, which is understandable – stevec Nov 24 '19 at 12:24
  • No, it has to be a database connection that you apply your (db)dplyr code to. – mnist Nov 24 '19 at 12:49
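For experimenting offline, one possible workaround for the `iris` error is to wrap the data frame as a lazy table on a simulated connection, which dbplyr provides for testing translations. This is only a sketch: simulate_dbi() is a generic backend, so the output is generic SQL rather than guaranteed BigQuery syntax, and getting BigQuery's own translation would presumably still need a connection object from bigrquery.

```r
library(dplyr)
library(dbplyr)

# Wrap the local data frame as a lazy table on a simulated (generic) connection.
# Assumption: simulate_dbi() stands in for a real BigQuery connection here.
iris_lazy <- tbl_lazy(iris, con = simulate_dbi())

iris_query <- iris_lazy %>%
  filter(Species == "setosa") %>%
  select(Species, Sepal.Length)

# show_query() now has a method to dispatch to, because iris_lazy is a lazy
# table rather than a plain data.frame. Nothing is executed.
show_query(iris_query)

# The same SQL captured as a character value.
iris_sql <- as.character(sql_render(iris_query))
```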