How can I use data.table syntax to produce a data.table where each column contains the differences between the column of the original data.table and the next column?
Example: I have a data.table where each row is a group, and each column is surviving population after year 0, after year 1, 2, etc. Such as:
pop <- data.table(group_id = c(1, 2, 3),
N = c(4588L, 4589L, 4589L),
N_surv_1 = c(4213, 4243, 4264),
N_surv_2 = c(3703, 3766, 3820),
N_surv_3 = c(2953, 3054, 3159) )
# group_id N N_surv_1 N_surv_2 N_surv_3
# 1 4588 4213 3703 2953
# 2 4589 4243 3766 3054
# 3 4589 4264 3820 3159
(Data types differ because N is a true integer count and N_surv_1, etc. are projections that could be fractional.)
What I have done: using the base diff
and matrix transposition, we can:
diff <- data.table(t(diff(t(as.matrix(pop[,-1,with=FALSE])))))
setnames(diff, paste0("deaths_",1:ncol(diff)))
cbind(group_id = pop[,group_id],diff)
# produces desired output:
# group_id deaths_1 deaths_2 deaths_3
# 1 -375 -510 -750
# 2 -346 -477 -712
# 3 -325 -444 -661
I know that I can use base diff
by group on a single column produced by melt.data.table
, so this works but ain't pretty:
melt(pop,
id.vars = "group_id"
)[order(group_id)][, setNames(as.list(diff(value)),
paste0("deaths_",1:(ncol(pop)-2)) ),
keyby = group_id]
Is that the most data.table-riffic way to do this, or is there a way to do it as a multi-column operation in data.table?