I ran into a problem that I have not encountered before. When I load only the data.table package (version 1.9.4) and then subset a dataset to remove a variable, I get no issues. However, when I also load plyr (version 1.8.2) and dplyr (version 0.4.1), I get the error below (I tried the same with a toy dataset as well). Note that the original file is in Excel, and I use library(readxl) to read it and save it in RData format (the file, vahere.RData, is available here: https://goo.gl/kzI5bD). The file has three variables: LINK_ID (numeric), TMC (character), and MPORegion (character). The error I get is:

Error in `[.tbl_df`(x, r, vars, with = FALSE) : 
unused argument (with = FALSE)

I don't remember encountering this error before. If anyone has any insight into what is going on, I would really appreciate it. I tried it on two separate machines (Windows 7) and get the same error. The Sys.info() output from both machines is below:

Machine 1 - sysname "Windows", release "7 x64", version "build 7601, Service Pack 1", machine "x86-64"
Machine 2 - sysname "Windows", release "7 x64", version "build 7601, Service Pack 1", machine "x86-64"

Below is the history of the run.

> library(data.table)
  data.table 1.9.4  For help type: ?data.table
  *** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
> load("vahere.RData")
> vahere[is.na(vahere)] <- "RestofVA"
> vahere <- setDT(vahere)
> 
> # Drop link id and identify unique tmc to region
> uniqtmcs <- subset(vahere,select=-c(1))
> library(plyr)
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:plyr’:
arrange, count, desc, failwith, id,
mutate, rename, summarise, summarize
The following objects are masked from ‘package:data.table’:
between, last
The following object is masked from ‘package:stats’:
filter
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union

> rm(vahere)
> load("vahere.RData")
> vahere[is.na(vahere)] <- "RestofVA"
> vahere <- setDT(vahere)
> 
> # Drop link id and identify unique tmc to region
> uniqtmcs <- subset(vahere,select=-c(1))
Error in `[.tbl_df`(x, r, vars, with = FALSE) : 
unused argument (with = FALSE) 
Krishnan
  • By the way, just use `setDT(vahere)`; don't assign the result with `<-`. It might be clearer if we knew what `vahere` was to begin with. Generally, `data.table` objects have better methods of subsetting than `subset`... – Frank May 15 '15 at 02:43
  • I also see the error. If you look at ```getAnywhere(`subset.data.table`)```, you'll see the line `ans <- x[r, vars, with = FALSE]`. This calls `[`, thinking that this `data.table` syntax will work, but it does not because it actually defers to `[.tbl_df` (from dplyr), which does not have a `with` argument. (I'm saying it's a bug, since these things should play nice together.) – Frank May 15 '15 at 03:07
  • Maybe not a bug. I get a warning that your data is "corrupted" after `setDT` and cannot reproduce it with normal data. – Frank May 15 '15 at 03:17
  • Frank..thanks for the insights and comments. Hmm...I did not get the corrupted file warning. Will check into it. That seems to be the most logical reason because I have used data.table and dplyr in combination in the past (as in the last 3 days or so) and have not run into these issues nor has there been any change in hardware on both machines. – Krishnan May 15 '15 at 03:19
  • It was a corrupted file and now there are no issues reading the file. – Krishnan May 15 '15 at 03:28
  • It's rather interesting that `dplyr` and `data.table` exhibit this incongruity. I thought `dplyr` was designed to expect data.table as a back-end? – IRTFM May 15 '15 at 04:09
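
The dispatch behaviour Frank describes above can be illustrated with a minimal base-R sketch. It does not use data.table or dplyr; the classes `fast_tbl` and `plain_tbl` and both `[` methods are hypothetical stand-ins, written only to show that the first entry in an object's class vector decides which `[` method runs, and that a call passing `with = FALSE` fails if that method has no such argument.

```r
# Hypothetical stand-ins: "fast_tbl" plays the role of a class whose `[`
# method accepts `with`, "plain_tbl" the role of one whose method does not.
`[.fast_tbl` <- function(x, i, j, with = TRUE) {
  "fast_tbl method"
}
`[.plain_tbl` <- function(x, i, j) {
  "plain_tbl method"
}

# Same data, different class order.
x <- structure(list(a = 1:3), class = c("plain_tbl", "fast_tbl"))
y <- structure(list(a = 1:3), class = c("fast_tbl", "plain_tbl"))

# `plain_tbl` comes first for x, so its method is chosen and rejects `with`.
err_msg <- tryCatch(
  x[1, 1, with = FALSE],
  error = function(e) conditionMessage(e)
)

# `fast_tbl` comes first for y, so the same call succeeds.
ok <- y[1, 1, with = FALSE]

print(err_msg)
print(ok)
```

On a real object, `class(vahere)` shows the order that dispatch will follow, which may be a quick way to check whether an object is in the state the calling code expects.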

1 Answer

The input file was corrupted; I apologize for posting on the forum. I did not get any warnings about the file being corrupted, and running with another toy dataset gave the exact same error.

Krishnan