I have the following code:

library(data.table)

main_cols <- c('num', 'let')
dt <- data.table(num = 1:5, let = letters[1:5])
dt

new_dt <- dt[CJ(num = num
                , let = let
                , unique = TRUE)
             , on = main_cols
             ]
head(new_dt, 10)

The thing is: I want to pass the columns to cross-join on as a vector. How do I “unpack” main_cols inside the CJ function? Thanks.

Anarcho-Chossid

1 Answer

I think you'll want to use do.call, as @AnandaMahto suggested:

m = dt[, do.call(CJ, .SD), .SDcols=main_cols]
dt[m, on=main_cols]

You could also create m this way:

m = do.call(CJ, dt[,main_cols,with=FALSE])

If you have repeating values in the columns, use the unique option to CJ:

m = dt[, do.call(CJ, c(.SD, unique=TRUE)), .SDcols=main_cols]
# or 
m = do.call(CJ, c(dt[,main_cols,with=FALSE], unique=TRUE))
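
To make the effect of `unique=TRUE` and of the join step concrete, here is a minimal sketch. The table `dt2` and its extra column `val` are hypothetical, made up for illustration; they are not from the question.

```r
library(data.table)

main_cols <- c("num", "let")

# Hypothetical data: repeated values in both key columns, plus an extra column
dt2 <- data.table(num = c(1, 1, 2), let = c("a", "b", "b"), val = c(10, 20, 30))

# Build the grid of unique combinations of the columns named in main_cols.
# c() turns the selected columns into a plain list and appends unique = TRUE,
# so do.call() ends up calling CJ(num = ..., let = ..., unique = TRUE).
m <- do.call(CJ, c(dt2[, main_cols, with = FALSE], unique = TRUE))

# Join dt2 onto the grid: every combination in m is kept, and combinations
# that do not occur in dt2 get NA in the non-key column val.
res <- dt2[m, on = main_cols]
print(res)
```

Here `m` holds only the key columns, while `res` also carries `val`, which is where the two results start to differ once the table has columns beyond `main_cols`.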
Frank
  • Can you explain why `dt[m, on=main_cols]` is needed, given that it produces the same result as `m`? Thanks. – fishtank Jan 12 '16 at 18:18
  • @fishtank In the OP's example, the two results are the same. If `dt` had some other columns besides those in `main_cols`, it would be different. Also, if some pairs of values in `m` either (i) did not appear in `dt` or (ii) appeared in multiple rows of `dt`, it would also be different (I think). Oh, forgot to mention: `dt[m,on=main_cols]` is a merge. – Frank Jan 12 '16 at 18:21
  • @Frank Thanks for the explanation. – fishtank Jan 12 '16 at 18:33
  • @Frank, thanks for the `unique` addendum. I do indeed have repeating values in one of the columns, and I was banging my head against the wall (I was passing a `list`, not a vector of arguments) until I looked at your code. – Anarcho-Chossid Jan 14 '16 at 16:37