
I have to calculate the increments of a variable m for a time interval (t2-t1). Here is a dummy version of my data frame:

library(lattice)  # provides do.breaks()
library(plyr)     # provides ddply()

df <- expand.grid(m = do.breaks(c(1, 10), 5),
                  sample = c("A", "B", "C", "D"))
df$t <- rep(1:6, 4)
df$d_m <- NA

What I'm trying to do is populate df$d_m[i] with the difference between df$m[i+1] and df$m[i], and this has to be done within each level of sample. This is my attempt, but it is not successful at all.

delta_m <- function(m, t){
               for(i in 1:length(t)){
               df$d_m[i] <- m[i+1] - m[i]
               }}

df <- ddply(df, .(sample, t), transform, d_m = delta_m(m, t))

Where am I wrong?

Jilber Urbina
matteo
  • `do.breaks` I believe is from lattice. If this is correct then please add `library(lattice)` to your code so that people don't have to figure that out themselves. – Dason Nov 12 '12 at 16:30
  • What do you want to be the value for the last value of d_m in each sample? – Dason Nov 12 '12 at 16:33
  • There are probably easier ways to create a set of "breaks" in `m` anyway. Back to the question: won't `df$d_m <- diff(m)` suffice? – Carl Witthoft Nov 12 '12 at 16:41
  • Sorry Dason, I didn't realize that do.breaks is from lattice, and the last d_m should be NA – matteo Nov 12 '12 at 17:15
  • Carl, diff() is good, but I cannot make it work in ddply. – matteo Nov 12 '12 at 17:21
  • Also I would like to understand the "customized" function I'm trying to build, because I will have to use it to do other stuff. But thanks for the suggestion, I'm trying to see if diff() would do the job – matteo Nov 12 '12 at 17:28

1 Answer

Split the data up by sample:

sdf <- split(df, df$sample)

Then apply diff() inside a transform() call to fill in d_m within each data frame component of sdf. Note that there is no t == 0, so the first observation of d_m in each sample is NA:

sdf <- lapply(sdf, function(x) transform(x, d_m = c(NA, diff(m))))
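
If the increment should instead sit on row i as m[i+1] - m[i], with NA on the last row of each sample (as requested in the comments on the question), the NA can be appended rather than prepended. A minimal sketch, assuming a stand-in data frame built with base R only so it runs without lattice (the object name fwd is just for illustration):

```r
# Stand-in data mirroring the question: 4 samples, 6 time points each
df <- data.frame(sample = rep(c("A", "B", "C", "D"), each = 6),
                 t = rep(1:6, times = 4),
                 m = rep(seq(1, 10, length.out = 6), times = 4))

sdf <- split(df, df$sample)

# Forward difference: row i holds m[i+1] - m[i]; the last row of each sample is NA
sdf <- lapply(sdf, function(x) transform(x, d_m = c(diff(m), NA)))

fwd <- do.call(rbind, sdf)
```

The only change from the version above is the position of NA inside c().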

Finally, combine the individual components back together:

df <- do.call(rbind, sdf)

Which results in:

> df
          m sample t d_m
A.A.1   1.0      A 1  NA
A.A.2   2.8      A 2 1.8
A.A.3   4.6      A 3 1.8
A.A.4   6.4      A 4 1.8
A.A.5   8.2      A 5 1.8
A.A.6  10.0      A 6 1.8
B.B.7   1.0      B 1  NA
B.B.8   2.8      B 2 1.8
B.B.9   4.6      B 3 1.8
B.B.10  6.4      B 4 1.8
B.B.11  8.2      B 5 1.8
B.B.12 10.0      B 6 1.8
C.C.13  1.0      C 1  NA
C.C.14  2.8      C 2 1.8
C.C.15  4.6      C 3 1.8
C.C.16  6.4      C 4 1.8
C.C.17  8.2      C 5 1.8
C.C.18 10.0      C 6 1.8
D.D.19  1.0      D 1  NA
D.D.20  2.8      D 2 1.8
D.D.21  4.6      D 3 1.8
D.D.22  6.4      D 4 1.8
D.D.23  8.2      D 5 1.8
D.D.24 10.0      D 6 1.8
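
As for why the attempt in the question fails: the for loop assigns into df$d_m inside the function, which only changes a local copy, and a for loop returns NULL, so delta_m() hands nothing back to transform. Grouping by .(sample, t) also leaves one row per group, so there is no next value to subtract. A rough sketch of a loop-based version that returns a vector, assuming the same kind of data rebuilt here with base R and using ave() for the grouping so that it runs without plyr:

```r
# Stand-in data mirroring the question: 4 samples, 6 time points each
df <- data.frame(sample = rep(c("A", "B", "C", "D"), each = 6),
                 t = rep(1:6, times = 4),
                 m = rep(seq(1, 10, length.out = 6), times = 4))

delta_m <- function(m) {
  d <- rep(NA_real_, length(m))               # pre-allocate; last element stays NA
  for (i in seq_len(max(length(m) - 1, 0))) {
    d[i] <- m[i + 1] - m[i]                   # forward difference
  }
  d                                           # return the vector explicitly
}

# Apply within each sample; ave() returns values in the original row order
df$d_m <- ave(df$m, df$sample, FUN = delta_m)

# With plyr loaded, grouping by sample alone should give the same column:
# df <- ddply(df, .(sample), transform, d_m = delta_m(m))
```

The key points are that the function builds and returns its own vector instead of writing to df, and that the grouping variable is sample only.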
Gavin Simpson
  • That worked perfectly. The only problem is that now I have to do the same, but instead of the difference between m[i+1] and m[i] I need their average. – matteo Nov 12 '12 at 18:19
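
For the follow-up in the comment above, the same split/lapply/rbind pattern should carry over if diff() is swapped for a pairwise mean. A minimal sketch, assuming a stand-in data frame built with base R; the helper pair_mean() and the column name avg_m are hypothetical names chosen for illustration, and NA is kept in the first position as in the answer:

```r
# Stand-in data mirroring the question: 4 samples, 6 time points each
df <- data.frame(sample = rep(c("A", "B", "C", "D"), each = 6),
                 t = rep(1:6, times = 4),
                 m = rep(seq(1, 10, length.out = 6), times = 4))

# Mean of each consecutive pair; head(m, -1) drops the last value,
# tail(m, -1) drops the first, so the two vectors line up pairwise
pair_mean <- function(m) c(NA, (head(m, -1) + tail(m, -1)) / 2)

sdf <- split(df, df$sample)
sdf <- lapply(sdf, function(x) transform(x, avg_m = pair_mean(m)))
out <- do.call(rbind, sdf)
```

Moving NA to the end of c() would align the mean with row i instead of row i+1.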