For those who are interested, I opened an issue on GitHub.
Consider the following two examples:
> library(data.table)
> iris <- as.data.table(iris)
> # option 1
> iris[, c('Species', paste0(c('Sepal.', 'Petal.'), 'Length'))]
       Species Sepal.Length Petal.Length
  1:    setosa          5.1          1.4
  2:    setosa          4.9          1.4
  3:    setosa          4.7          1.3
  4:    setosa          4.6          1.5
  5:    setosa          5.0          1.4
 ---
146: virginica          6.7          5.2
147: virginica          6.3          5.0
148: virginica          6.5          5.2
149: virginica          6.2          5.4
150: virginica          5.9          5.1
> # option 2
> iris[, c('Species', grep('Length', names(iris), value = TRUE))]
[1] "Species" "Sepal.Length" "Petal.Length"
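The j expression is similar in option 1 and option 2, but the results are different: option 1 returns a data.table with the selected columns, while option 2 returns the column names themselves. I know I can get the column selection in the following way: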
> # option 3
> x <- grep('Length', names(iris), value = TRUE)
> iris[, c('Species', ..x)]
       Species Sepal.Length Petal.Length
  1:    setosa          5.1          1.4
  2:    setosa          4.9          1.4
  3:    setosa          4.7          1.3
  4:    setosa          4.6          1.5
  5:    setosa          5.0          1.4
 ---
146: virginica          6.7          5.2
147: virginica          6.3          5.0
148: virginica          6.5          5.2
149: virginica          6.2          5.4
150: virginica          5.9          5.1
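As far as I can tell, two other documented routes also give the column selection when the names are computed: passing with = FALSE, or using .SD together with .SDcols. Below is a minimal sketch of both; the names dt, cols, res_with, res_inline and res_sd are only for illustration and are not part of the examples above.

```r
library(data.table)

# Work on a copy so the built-in iris data set is left untouched
dt <- as.data.table(iris)

# Column names computed at run time, as in option 2
cols <- c("Species", grep("Length", names(dt), value = TRUE))

# with = FALSE tells data.table to treat j as a vector of column names
res_with <- dt[, cols, with = FALSE]

# The same, with the expression written inline in j
res_inline <- dt[, c("Species", grep("Length", names(dt), value = TRUE)),
                 with = FALSE]

# .SD restricted to the columns listed in .SDcols
res_sd <- dt[, .SD, .SDcols = cols]

print(names(res_with))
```

All three results should be data.tables holding the Species, Sepal.Length and Petal.Length columns, so they sidestep the difference between option 1 and option 2 without explaining it.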
However, I wonder why option 1 results in column selection while option 2 is evaluated into a character vector.
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.2
loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1