I'm trying to run rollapply from the bottom of my data.frame up to the top. The last row of the data.frame (RBH) is the final measurement for a given subject in 2012. Working backwards from that value, I need to subtract each annual measurement in turn to calculate what the individual's size would have been in each prior year.

Sample data.frame:

df1 <- structure(c(1.62, 3.96, 4.89, 6.61, 8.79, 
                   57.15, 2.43, 5.58, 7.2, 9.3, 
                   11.87, 66.6, 1.47, 3.49, 4.32,
                   NA, NA, 60.75),
                 .Dim = c(6L, 3L),
                 .Dimnames = list(c("2008", "2009", "2010","2011", "2012","RBH"),
                                  c("Tree001", "Tree002", "Tree003")))

Intended output:

Tree001 <- c(31.28, 32.90, 36.86, 41.75, 48.36, 57.15)    
Tree002 <- c(29.62, 32.05, 37.63, 44.83, 54.13, 66.00)    
Tree003 <- c(51.47, 52.94, 56.43, 60.75, NA, NA)    
df2 <- data.frame(Tree001, Tree002, Tree003)    
rownames(df2) <- 2007:2012    

I've tried running rollapply backwards, following a suggestion I found at "Rollapply() backwards in R", but I didn't get the intended output. It came out as a list instead of a data.frame, and it subtracted each value from the adjacent cell, not from the running value.

Code I tried:

if ( !require(zoo) ) print(" Need pkg:zoo for rollapply")
df3 <- rollapply(df1[length(df1):1], width=2, diff, fill=NA, partial=T)    
df3    
 [1]     NA     NA     NA  -0.83  -2.02  65.13 -54.73  -2.57  -2.10  -1.62    
[11]  -3.15  54.72 -48.36  -2.18  -1.72  -0.93  -2.34     NA

Any suggestions would be appreciated.

KKL234
  • I think you are just missing a comma. Try: `df3 <- rollapply(df1[length(df1):1 , ], ....)` – IRTFM Jul 30 '14 at 17:04
  • You are missing a comma, but that still won't work. `diff` does not do what you want here (it also doesn't make sense to use it with rollapply) – Señor O Jul 30 '14 at 17:06
  • The other error that gets exposed is the use of `length` when `nrow` was appropriate. The input was a matrix, and the "length" of a matrix is not the number of rows. (If it were a data.frame, the "length" would also not be the number of rows, but rather the number of columns.) – IRTFM Jul 30 '14 at 17:24
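
To make the `length` versus `nrow` point from these comments concrete, here is a small sketch on a toy matrix. The names `m`, `d`, `m_flat` and `m_rev` are made up for illustration and do not come from the question.

```r
# Toy 6 x 3 matrix standing in for df1 (illustrative values only)
m <- matrix(1:18, nrow = 6, ncol = 3)

length(m)   # number of cells in the matrix (rows * columns)
nrow(m)     # number of rows

d <- as.data.frame(m)
length(d)   # for a data.frame, length() counts columns
nrow(d)     # number of rows

# Without the comma, indexing treats the matrix as one long vector,
# so the result is a flat, fully reversed vector with no dimensions.
m_flat <- m[length(m):1]

# With nrow() and the comma, only the row order is reversed
# and the matrix shape is kept.
m_rev <- m[nrow(m):1, ]
dim(m_rev)
```

This is why the attempt in the question returned a single vector of 18 values rather than a reversed table.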

1 Answer

This is closer to what you want:

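df1[is.na(df1)] <- 0   # this is how you're actually treating the NAs!
df1 <- data.frame(df1)

# Reverse the rows so RBH comes first, then subtract the running
# total of the annual values from it
df2 <- apply(df1[nrow(df1):1, ], 2, function(x) c(x[1], x[1] - cumsum(x[-1])))
df2 <- df2[nrow(df2):1, ]   # restore the original row order
df2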
     Tree001 Tree002 Tree003
2008   31.28   30.22   51.47
2009   32.90   32.65   52.94
2010   36.86   38.23   56.43
2011   41.75   45.43   60.75
2012   48.36   54.73   60.75
RBH    57.15   66.60   60.75
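
As a rough illustration of what the `cumsum` step does, here is the same calculation on a single column, using the Tree001 values from the question with RBH placed first. The names `x` and `steps` are only for this sketch.

```r
# Tree001 with the rows already reversed: RBH first, then 2012 back to 2008
x <- c(RBH = 57.15, `2012` = 8.79, `2011` = 6.61,
       `2010` = 4.89, `2009` = 3.96, `2008` = 1.62)

# Running total of the annual values, newest year first
cumsum(x[-1])

# Keep the RBH value, then subtract the running total from it
steps <- c(x[1], x[1] - cumsum(x[-1]))

# Put the result back in chronological order
rev(steps)
```

Note that each result keeps the label of the annual value that was subtracted, so the number labelled 2012 is the RBH value minus the 2012 measurement. That is why the row labels above run from 2008 to RBH rather than from 2007 to 2012.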
Señor O
  • I appreciate the quick response! This works well for obtaining values for Tree001 and Tree002, but not Tree003. Tree003 died in 2010 (hence the NA values in df1), so 60.75 shouldn't be listed until 2010. Also, is there a way to change the row names (without actually using rownames(df2) <- ......) to read 2007 to 2012? Basically the 2012 values for df2 should equal RBH - the 2012 value of df1. Thanks! – KKL234 Jul 30 '14 at 19:35
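
One possible way to address the two points in this comment is sketched below. It targets the "Intended output" table in the question, where the RBH value sits in the last year that has a measurement and the rows are labelled 2007 to 2012. `back_calc` is a hypothetical helper written for this sketch, not a function from zoo or base R. It assumes the last row is RBH, that each column has at least one annual value, and that any NAs occur only after the last measured year.

```r
# Sample data from the question, with the NAs left in place
df1 <- structure(c(1.62, 3.96, 4.89, 6.61, 8.79, 57.15,
                   2.43, 5.58, 7.2, 9.3, 11.87, 66.6,
                   1.47, 3.49, 4.32, NA, NA, 60.75),
                 .Dim = c(6L, 3L),
                 .Dimnames = list(c("2008", "2009", "2010", "2011", "2012", "RBH"),
                                  c("Tree001", "Tree002", "Tree003")))

# Hypothetical helper: back-calculate sizes for one column.
# Assumes NAs (if any) are trailing and the last element is RBH.
back_calc <- function(x) {
  n    <- length(x)
  rbh  <- unname(x[n])              # final measurement
  inc  <- unname(x[-n])             # annual values, oldest first
  last <- max(which(!is.na(inc)))   # last year with a measurement
  out  <- rep(NA_real_, n)          # years after `last` stay NA
  # Slot last + 1 holds RBH; each earlier slot subtracts one more annual value
  out[seq_len(last + 1)] <- rbh - rev(cumsum(rev(c(inc[seq_len(last)], 0))))
  out
}

# Label rows with the year before the first measurement through the last year,
# passing them via row.names instead of a separate rownames<- call
years <- as.integer(rownames(df1)[-nrow(df1)])
df2 <- data.frame(apply(df1, 2, back_calc),
                  row.names = as.character(c(min(years) - 1L, years)))
round(df2, 2)
```

With the sample data as given, Tree002 comes out as 30.22 through 66.60, matching the answer above. The Tree002 figures in the question's intended output (29.62 through 66.00) are consistent with an RBH of 66.00 rather than 66.6.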