0

Sorry if this question has been asked elsewhere, I can't find it. I'm working through some basic examples in MonetDBLite.

> dbGetQuery(dbcon, "SELECT MAX(mpg) FROM mtcars WHERE cyl = 8")
L3
1 19.2

works, but

> ms <- MonetDBLite::src_monetdblite("./DB")
> t <- tbl(ms, "mtcars")
Error in UseMethod("tbl") : 
no applicable method for 'tbl' applied to an object of class
"c('src_monetdb', 'src_sql', 'src')"

It seems that it's trying to assign the db to t not the table.

Any suggestions would be greatly appreciated.

I've been perusing resources and found a useR2016 presentation and noticed a difference here:

> ms
src:  MonetDBEmbeddedConnection
tbls: mtcars

Curious...

nzgwynn
  • 81
  • 1
  • 10

3 Answers3

2

I'm a huge fan of using MonetDBLite together with dplyr. My addition to Hannes Mühleisen's (thanks for the package!) answer would be that it appears that the order you load the packages can matter. Loading MonetDBLite after dplyr and dbplyr seems to be the key for me. Loading MonetDBLite first causes errors similar to the one nzgwynn noted.

Sometimes I could connect to the database with no problems. Other times I would get error messages like:

Error in UseMethod("db_query_fields") : no applicable method for 'db_query_fields' applied to an object of class "MonetDBEmbeddedConnection"

Like nzgwynn, I was puzzled about why it would work sometimes but not others. Restarting and reinstalling wouldn't necessarily fix it for me.

This clue, from an issue filed about sparklyr, lead me to explore the package loading order:

https://github.com/rstudio/sparklyr/issues/38

Like noted there with sparklyr, and I've noticed with other R database packages, MonetDBLite will load and attach automatically if the Global Environment already contains a connection object. My problem was that I had an src_monetdb object in my workspace, which was causing MonetDBLite to load upon starting RStudio. So I while I thought I was loading it after dplyr and dbplyr, it was really loading first. If I clear the workspace and then restart, I can load the packages in the preferred order. So far, this method has worked.

I've seen starting with a clean workspace advised as good practice generally, e.g.: https://twitter.com/hadleywickham/status/561146907519500288. Starting with a fresh workspace loses you no time either given MonetDBLite's speedy query ability.

Lastly, I would put a enthusiastic pitch in for using MonetDBLite. I saw it mentioned on RStudio's database page and was immediately impressed on how easy it was to setup and how fast it is. It's the best way I've found for working with a ~2 GB dataset in R. When exploring the data interactively, the dplyr queries run so quickly that it feels like I'm working with the data in memory. And if all I want to do is load the whole dataset into memory, MonetDBLite is as fast or faster than other methods I've tried like read.fst() from the fst package.

Corey N.
  • 159
  • 11
  • Thanks for the thorough answer!! Do you have any idea why the order that packages are loaded is important? What does this do to the environment? Also, if I have the data (D) in memory, then use D<- data.table(D) and run it through tidyverse it works quickly too. – nzgwynn Nov 14 '18 at 20:42
  • Hi @nzgwynn, I'm glad you posted the question because I don't think I would have got it working otherwise. Why the order matters, I don't know... I'm just a grateful user of all this great free technology. In reading about similar issues, my guess is that there is some conflict between the packages. Where packages use the same name for a function or object, the package that loads later "masks" the objects loaded before it. So that was my working theory--loading `MonetDBLite` last would give it precedence over the other packages. Yet that theory could be off... – Corey N. Nov 14 '18 at 22:04
  • I spoke with someone who said that dplyer and dbplayer change the pointers around which can make it difficult for R (and other packages) to find things. – nzgwynn Jan 16 '19 at 20:15
0

You need to call library("dplyr") before using tbl and friends. Also make sure you have dbplyr installed.

Update: Also, please make sure there is no connection object (src) in a stored workspace loaded at startup. Loading connections from .Rdata files does not work! Instead, create the connection/src from scratch every time you run a script.

Hannes Mühleisen
  • 2,542
  • 11
  • 13
  • They are! '> library("dbplyr") > library("dplyr") > mt <- tbl(ms, "mtcars") Error in UseMethod("tbl") : no applicable method for 'tbl' applied to an object of class "c('src_monetdb', 'src_sql', 'src')" ' – nzgwynn Oct 11 '18 at 14:35
  • It works perfectly on my Ubuntu VM, but not locally on my Windows10. Why would that be? – nzgwynn Oct 11 '18 at 15:21
  • OMG!! I closed R and started it again and it worked!! WTF?? I don't understand why it didn't work before though... – nzgwynn Oct 11 '18 at 15:43
0

I closed R and opened it again and the same coding worked fine...

nzgwynn
  • 81
  • 1
  • 10