
I want to compute cumulative sums for several columns, grouped by year, as explained in the question "R: ddply repeats yearly cumulative data". That is:

ddply(mydf, "year", transform, 
      cumsum1 = cumsum(myvalue1), 
      cumsum2 = cumsum(myvalue2))

I tried the following.

Solution 1:

1. Created a list of destination names for the cumulative sums and a list of source names.

2. Ran ddply(mydf,"year",transform,dstnList=srcList)

3. Got the following error:

"arguments imply differing number of rows: 1385, 280
In addition: Warning message:
In eval(expr, envir, enclos) : NAs introduced by coercion"

Solution 2:

1. Created the following function:

findCumSum<-function(srcdf,columnlist){  
  for (i in 1:length(columnlist)){  
    ddply(srcdf,"g_id",transform,cumsum(names(srcdf)[columnlist[i]]))  
  }  
  srcdf  
}

2. Called the function with the list of source column indices: findCumSum(mydf,srcIdxList)

I get the following error:

"Error in eval(expr, envir, enclos) : object 'srcdf' not found" 

How can I solve this problem?

Ramakrishnan Kannan

1 Answer

Maybe something like this?

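library(plyr)  # provides ddply(), colwise() and .()

# Example data: grouping column x plus two numeric columns
dat <- data.frame(x = rep(letters[1:2],each = 10),
                  y1 = 1:20,
                  y2 = 20:1)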

> ddply(dat,.(x),colwise(cumsum))
   x  y1  y2
1  a   1  20
2  a   3  39
3  a   6  57
4  a  10  74
5  a  15  90
6  a  21 105
7  a  28 119
8  a  36 132
9  a  45 144
10 a  55 155
11 b  11  10
12 b  23  19
13 b  36  27
14 b  50  34
15 b  65  40
16 b  81  45
17 b  98  49
18 b 116  52
19 b 135  54
20 b 155  55
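
If you need to keep the original columns and give the results your own names (as with the destination and source lists in the question), a base R alternative to ddply is ave(), which applies a function within each group and returns a vector aligned with the input rows. This is only a minimal sketch: the toy values in mydf are made up for illustration, and srcList and dstnList are assumed here to be character vectors of column names.

```r
# Toy data modelled on the question (values are invented for illustration):
# a grouping column plus two value columns.
mydf <- data.frame(year = rep(c(2001, 2002), each = 3),
                   myvalue1 = 1:6,
                   myvalue2 = 6:1)

srcList  <- c("myvalue1", "myvalue2")  # columns to accumulate (assumed names)
dstnList <- c("cumsum1", "cumsum2")    # names for the new columns (assumed names)

# For each source column, compute the cumulative sum within each year.
# ave() keeps the original row order, so the result can be assigned
# straight back as new columns.
mydf[dstnList] <- lapply(mydf[srcList],
                         function(col) ave(col, mydf$year, FUN = cumsum))

mydf
```

Because the column names are ordinary strings, the same call works for any number of columns without writing one cumsum argument per column.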
joran