dbplyr
has some very useful-looking simulation functions so you can write queries while not connected to any real database, but I can't seem to get actual table names into any of the queries I write that way. All their names are just `df`
, and I can't seem to modify them afterward. In fact, I don't see `df`
anywhere in the query object or its attributes (it doesn't have any), so now I have no idea how dbplyr
processes table names at all.
MWE:
library(dbplyr)
library(dplyr, warn.conflicts = FALSE)
library(magrittr)
library(purrr, warn.conflicts = FALSE)
query <- tbl_lazy(df = mtcars)
query %$% names(ops)
#> [1] "x" "vars"
show_query(query)
#> <SQL>
#> SELECT *
#> FROM `df`
# The actual data frame is stored in the object under the name `x`, but
# renaming it has no effect, unsurprisingly (since it wasn't named `df`
# anyway)
query %<>% modify_at("ops", set_names, "mtcars", "vars")
query %$% names(ops)
#> [1] "mtcars" "vars"
show_query(query)
#> <SQL>
#> SELECT *
#> FROM `df`
My use case, by the way, is that I need to run SQL queries in another system with actual server access, so I'd like to have R scripts that produce SQL syntax that's ready to run in that system, even though R can't connect to it. Making an empty dummy database with the structure of the real thing (table & column names, column types, but no rows) is an option, but, obviously, it'd be simpler to just use these free-form simulations, iff the SQL can be generated ready to cut and paste. (lazy_frame()
looked more appropriate for such non-existent tables, but, guess what, it's really just a wrapper for tbl_lazy(tibble())
, so, same exact `df`
name problem.)
Created on 2019-12-12 by the reprex package (v0.3.0)