I need to efficiently "unroll" a list column in an R data.frame. For example, if I have a data.frame defined as:
dbt <- data.frame(values=c(1,1,1,1,2,3,4),
parm1=c("A","B","C","A","B","C","B"),
parm2=c("d","d","a","b","c","a","a"))
Then, assume an analysis that generates one column as a list, similar to the following output:
agg <- aggregate(values ~ parm1 + parm2, data=dbt,
FUN=function(x) {return(list(x))})
The aggregated data.frame looks like (where class(agg$values) == "list"):
parm1 parm2 values
1 B a 4
2 C a 1, 3
3 A b 1
4 B c 2
5 A d 1
6 B d 1
I'd like to unroll the "values" column , repeating the parm1 & 2 values (adding more rows) in an efficient manner for each element of the list over all the data.frame rows.
At the top level I wrote a function that does the unroll in a for loop called in an apply. It's really inefficient, (the aggregated data.frame takes about an hour to create and nearly 24 hours to unroll, the fully unrolled data has ~500k records). The top level I'm using is:
unrolled.data <- do.call(rbind, apply(agg, 1, FUN=unroll.data))
The function just calls unlist() on the value column object then builds a data.frame object in a for loop as the returned object.
The environment is somewhat restricted and the tidyr, data.table and splitstackshape libraries are unavailable to me, it needs to not only be functions found in base:: but limited to those available in v3.1.1 and before. Thus the answers in this (not really a duplicate) question do not apply.
Any suggestions on something faster?
Thanks!