This thread discussed how to do it for a data frame. I want to do something a little more complicated than that:
library(data.table)
dt <- data.table(A = c(rep("a", 3), rep("b", 4), rep("c", 5)), B = rnorm(12, 5, 2))
dt2 <- dt[order(A, B)] # Sort by A, then by B within A
# 4th row of each group; always shows the group label from A, even when B is NA
do.call(rbind, by(
dt2, dt2$A,
function(x) data.table(A = x[,A][1], B = x[,B][4])
)
)
# In reply to Vlo's comment below: if I do this, both columns of the row are returned as NA
do.call(rbind,
by(dt2, dt2$A, function(x) x[4])
)
# Take the max value of B within each level of A (the last row, since dt2 is sorted by B)
do.call(rbind, by(dt2, dt2$A,
function(x) tail(x, 1))
)
What are more efficient ways to do this with native data.table functions?
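For reference, here is a rough sketch of the kind of thing I am hoping for, assuming that grouping with `by = A` and subsetting `.SD` inside `j` is a reasonable route. I have not benchmarked it; `set.seed()` and the names `fourth`, `last_row` and `max_b` are only there for illustration.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
dt <- data.table(A = c(rep("a", 3), rep("b", 4), rep("c", 5)),
                 B = rnorm(12, 5, 2))
setorder(dt, A, B)  # sort in place by A, then by B within A

# 4th row of each group; a group with fewer than 4 rows yields NA in B
# while the group label in A is kept
fourth <- dt[, .SD[4], by = A]

# Last row of each group, which is the max of B because of the sort above
last_row <- dt[, .SD[.N], by = A]

# Max of B per group, computed directly without relying on the sort order
max_b <- dt[, .(B = max(B)), by = A]
```

The `.SD[4]` form seems to keep the label for group "a" and only sets `B` to NA, which is the behaviour I was trying to get from the first `by()` call.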