2

Example code:

library(DBI)
library(RSQLite)

# will require more details (like user, password, host, port, etc.)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
data(USArrests)
dbWriteTable(con, "USArrests", USArrests)
dbListTables(con)

d0 <- tbl(con, "USArrests")
dbGetQuery(d0, "select * from USArrests")
dbGetQuery(d0, "select * from d0")

Which returns:

Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘dbGetQuery’ for signature ‘"tbl_dbi", "character"’

Obviously I could just use dbGetQuery on con, but was wondering if there is a way to have it work on d0 directly.

Thanks.

Tal Galili
  • 24,605
  • 44
  • 129
  • 187
  • Not `tbl_sql` or `tbl_dbi` but the sqldf package can process data frames using SQL and a variety of backends including sqlite. `library(sqldf); sqldf("select * from USArrests limit 3")` – G. Grothendieck Feb 11 '18 at 13:23

3 Answers3

3

On the one hand, is the tbl() function which is a SQL statement of type SELECT * FROM Table this works over DBMS and to retrieve data from database server it need a pull function like collect(). On the other hand, is the function dbGetQuery() which retrieves the data into R session using a SQL query. Both requires a connection to server and a statement, but the first create the statement using SQL translations and the other is the user which write the SQL query.

To illustrate I will use a temporal table tmp in a postgreSQL DBMS:

# Example on postgreSQL
library(tidyverse)
library(dbplyr)
library(RPostgreSQL)
library(DBI) # This is loaded with RPostgreSQL package

con <- dbConnect(PostgreSQL(), 
                 dbname="test",
                 host="localhost",
                 port=5432,
                 user="user",
                 password="pass")

Dummy data to PostgreSQL server

date <- data_frame(Col1 = c("20180212", "20180213"),
                   Col2 = c("A", "B"))
dbWriteTable(con, "tmp", date, temporary = TRUE)

With tbl() function

tbl(con, "tmp") %>% show_query()

#><SQL> 
#>SELECT * 
#>FROM "tmp"

tbl(con, "tmp") %>% 
  mutate(date = to_date(Col1, "YYYYMMDD")) %>%
  show_query()

#><SQL>
#>SELECT "row.names", "Col1", "Col2", TO_DATE("Col1", 'YYYYMMDD') AS "date"
#>FROM "tmp"

tbl(con, "tmp") %>% 
  mutate(date = to_date(Col1, "YYYYMMDD")) %>% #this works on DBMS
  collect() %>% #This retrive to R session
  str()

#>Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 2 obs. of  3 variables:
#> $ row.names: chr  "1" "2"
#> $ Col1     : chr  "20180212" "20180213"
#> $ Col2     : chr  "A" "B"
#> $ date     : Date, format: "2018-02-12" "2018-02-13"

With dbGetQuery() function

dbGetQuery(con, "SELECT * FROM tmp") %>% 
  str()

#>'data.frame': 2 obs. of  3 variables:
#> $ row.names: chr  "1" "2"
#> $ Col1     : chr  "20180212" "20180213"
#> $ Col2     : chr  "A" "B"

dbGetQuery(con, "SELECT * FROM tmp") %>%
  mutate(date = as.Date(Col1, format = "%Y%m%d")) %>% #This works on R session
  str()

#>'data.frame': 2 obs. of  4 variables:
#> $ row.names: chr  "1" "2"
#> $ Col1     : chr  "20180212" "20180213"
#> $ Col2     : chr  "A" "B"
#> $ date     : Date, format: "2018-02-12" "2018-02-13"

Conclusion

tbl() function is a high level over dbGetQuery() in R-programming. Consider to re-design your code chain understanding the differences between both functions and best uses for these.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
gavg712
  • 300
  • 1
  • 10
1

No, you can't use dbGetQuery with dplyr in this way, it only work with DBIConnection.

Also, your first query is redundany, d0 already represent the USArrests data and second query is nonsense.

dplyr is using a bit different approach, it uses dplyr verbs and creating an SQL query:

d0 %>% filter(Murder > 10) %>% show_query()
m0nhawk
  • 22,980
  • 9
  • 45
  • 73
  • Hi @m0nhawk. I know that it doesn't work, the question is how can I write something like "filter". It obviously runs some SQL, I want to be able to write a customized function that does it as well. – Tal Galili Feb 11 '18 at 12:42
  • 1
    @TalGalili I don't understand what you want. You want to write your own `filter`? Then you can check the `dplyr` development guides. – m0nhawk Feb 11 '18 at 14:07
1

ok, looking at the str helped reveal how to access the needed elements using the following commands:

  con <- d0$src$con
  db_name <- db_list_tables(con)[1]
Tal Galili
  • 24,605
  • 44
  • 129
  • 187
  • This looks interesting. Could you explain what it is doing? – Rasmus Larsen Feb 14 '18 at 07:20
  • 1
    Hi Rasmus, it is getting the original connection and used database name from the dbplyr object (tbl_lazy, tbl_sql). This way we could directly use things like dbGetQuery on it without first going through the dplyr collect function. I personally wanted it for getting a sample of the rows - but it makes sense only if no other dplyr verbs have been used on the object. – Tal Galili Feb 15 '18 at 05:20