What I'm trying to achieve is this: say I have a data.table DT with (say) 4 columns. For every combination of 2 columns, I want the number of unique rows.
DT <- data.table(a = 1:10, b = c(1,1,1,2,2,3,4,4,5,5), c = letters[1:10], d = c(3,3,5,2,4,2,5,1,1,5))
> DT
     a b c d
 1:  1 1 a 3
 2:  2 1 b 3
 3:  3 1 c 5
 4:  4 2 d 2
 5:  5 2 e 4
 6:  6 3 f 2
 7:  7 4 g 5
 8:  8 4 h 1
 9:  9 5 i 1
10: 10 5 j 5
I tried the following code:
cols <- colnames(DT)
for (i in 1:(length(cols)-1)) {
  for (j in i+1:length(cols)) {
    print(unique(DT[, .SD, .SDcols = c(cols[i], cols[j])]))
  }
}
Here, 'i' runs from the first column to the second-to-last, and 'j' is the column paired with 'i'. So the combinations I expect to get are: ab, ac, ad, bc, bd, cd.
But it gives me the following error:
Error in `[.data.table`(DT, , .SD, .SDcols = c(cols[i], cols[j])) : .SDcols missing at the following indices: [2]
If someone can explain why this happens and suggest a way around it, I'd be really grateful. Thanks.
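A likely cause, offered as a sketch rather than a definitive diagnosis: in R the `:` operator binds more tightly than `+`, so `i+1:length(cols)` is evaluated as `i + (1:length(cols))`. With `i = 1` that makes `j` run over 2, 3, 4, 5, and `cols[5]` is `NA`, which would match the complaint about index [2] of `.SDcols`. The code below assumes the goal is the count of distinct rows per column pair. It shows the loop with the parentheses added, followed by an alternative using `combn()` and data.table's `uniqueN()`. The names `pair`, `pairs` and `counts` are only illustrative.

```r
library(data.table)

DT <- data.table(a = 1:10, b = c(1,1,1,2,2,3,4,4,5,5),
                 c = letters[1:10], d = c(3,3,5,2,4,2,5,1,1,5))
cols <- colnames(DT)

# Minimal change to the original loop: parenthesise (i + 1) so that
# j runs from i + 1 up to length(cols) and never past the last column.
for (i in 1:(length(cols) - 1)) {
  for (j in (i + 1):length(cols)) {
    pair <- c(cols[i], cols[j])
    n <- nrow(unique(DT[, .SD, .SDcols = pair]))
    cat(paste(pair, collapse = ""), n, "\n")
  }
}

# Alternative without index arithmetic: combn() enumerates every pair of
# column names, and uniqueN() counts the distinct rows over those columns.
pairs <- combn(cols, 2, simplify = FALSE)
counts <- sapply(pairs, function(p) uniqueN(DT, by = p))
names(counts) <- sapply(pairs, paste, collapse = "")
print(counts)
```

The `combn()` version sidesteps the precedence issue entirely, and it extends to triples or larger combinations by changing the second argument.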