fast join data.table (potential bug, checking before reporting)

Question

This might be a bug. If so, I will delete this question and report it as a bug. I would like someone to take a look to make sure I'm not doing something incorrectly, so I don't waste the developers' time.

test = data.table(mo=1:100, b=100:1, key=c("mo", "b"))
mo = 1
test[J(mo)]

That returns the entire test data.table instead of the correct result returned by

test[J(1)]

I believe the error might be caused by test having a column with the same name, mo, as the variable used in the join. Does anyone else get the same problem?

I can't explain the behavior, but fwiw: `foo=1; test[J(foo)]` has expected results. The same is true of `test[mo]` and `mo = data.table(1); test[mo]`. — Justin, Jan 08 '13 at 16:11
Also, `identical(test[J(1)], test[J(mo <- 1)])` gives `TRUE`. — Ryogi, Jan 08 '13 at 16:12

Josh O'Brien · Accepted Answer · 2013-01-08T18:25:12.887

This is a scoping issue, similar to the one discussed in data.table-faq 2.13 (warning, pdf). Because test contains a column named mo, when J(mo) is evaluated it returns that entire column, rather than the value of the mo found in the global environment, which the column masks. (This scoping behavior is, of course, quite nice when you want to do something like test[mo<4]!)

Try this to see what's going on:

test <- data.table(mo=1:5, b=5:1, key=c("mo", "b"))
mo <-  1
test[browser()]
Browse[1]> J(mo)
#    mo
# 1:  1
# 2:  2
# 3:  3
# 4:  4
# 5:  5
# Browse[1]>
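
The same masking can be checked without an interactive browser session. The following is only a minimal sketch, assuming the 5-row test table from above; the names masked and literal are introduced here purely for illustration.

```r
library(data.table)

test <- data.table(mo = 1:5, b = 5:1, key = c("mo", "b"))
mo <- 1

# Inside i, `mo` is looked up among the columns of `test` first,
# so J(mo) is built from the whole column rather than the outer variable.
masked <- test[J(mo)]

# A literal has no name to resolve, so this joins on the single value 1.
literal <- test[J(1)]

# Compare how many rows each join matched.
print(c(masked = nrow(masked), literal = nrow(literal)))
```

If the column is masking the outer variable, the first count should match nrow(test) while the second is a single row.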

As suggested in the linked FAQ, a simple solution is to rename the indexing variable:

MO <- 1
test[J(MO)]
#    mo b
# 1:  1 5

This will also work, for reasons discussed in the documentation of i in ?data.table:

mo <- data.table(1)
test[mo]
#    mo b
# 1:  1 5

score 4 · Answer 2 · answered Jan 08 '13 at 16:35

This is not a bug, but documented behaviour afaik. It's a scoping issue:

test[J(globalenv()$mo)]
   mo   b
1:  1 100
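
The explicit-environment lookup above relies on mo living in the global environment. Inside a function that is not the case, so a variant is to capture the function's own frame and look the name up there. This is only a sketch: lookup_mo and caller_env are hypothetical names introduced for illustration, and it assumes the table has no column called caller_env.

```r
library(data.table)

test <- data.table(mo = 1:100, b = 100:1, key = c("mo", "b"))

# Hypothetical helper, not part of data.table. The argument is
# deliberately named `mo` to reproduce the clash with the column.
lookup_mo <- function(dt, mo) {
  # Capture this function's frame. `caller_env` is assumed not to be
  # a column name in `dt`, so inside i it still resolves to this variable.
  caller_env <- environment()
  # get() with an explicit envir skips the column lookup for "mo".
  dt[J(get("mo", envir = caller_env))]
}

res <- lookup_mo(test, 1)
print(res)
```

Renaming the argument so it no longer matches a column is usually simpler; the explicit lookup is mainly useful when the name cannot be changed.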

answered Jan 08 '13 at 16:35

Roland
