I have 3 columns of data; the first 10 rows of each column are shown below.

    set.seed(101)
    inputx <- rnorm(1000, mean = 3, sd = 2)
    inputy <- rnorm(1000, mean = 2, sd = 1)
    inputz <- rnorm(1000, mean = 1, sd = 3)
    example <- cbind(inputx, inputy, inputz)

    > head(example,10)
        inputx      inputy     inputz
 [1,] 2.347927  2.50319581  4.4931430
 [2,] 4.104924 -0.09747067 -0.2836938
 [3,] 1.650112  1.90551542  0.9146087
 [4,] 3.428719  3.01454241  4.9332929
 [5,] 3.621538  1.92059955  2.4220865
 [6,] 5.347933  1.74487106  2.9122764
 [7,] 4.237580  2.78095054  7.8622898
 [8,] 2.774531  3.20741266 -1.5977934
 [9,] 4.834057  1.09214734 -0.5482315
[10,] 2.553481  0.59679215  0.5285020

My actual data has 10 variables in total, but for simplicity I used just 3 in this example.

For each permutation of columns listed below, I would like to calculate the ccf using a rolling window of size 4:

    inputx,inputx
    inputx,inputy
    inputx,inputz
    inputy,inputx
    inputy,inputy
    inputy,inputz
    inputz,inputx
    inputz,inputy
    inputz,inputz

e.g.

    ccf(example[1:4, 1], example[1:4, 2])
    ccf(example[2:5, 1], example[2:5, 2])
    # ...
    ccf(example[7:10, 1], example[7:10, 2])

As you can see, the sample above only works on columns 1 and 2, but I hope to do it for all columns using the rolling window approach.

The rolling window itself is easily handled with the rollapply function from the zoo package: you specify the window size and set by.column = FALSE so that the function is applied to the whole window rather than to each column separately.

If I wanted to apply a rolling function such as mean to each column, I could do that, but looping across columns for all combinations just blows my mind and I can't figure it out.

    test <- rollapply(example[, c(1, 2)], 4, mean, by.column = TRUE)
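
As a minimal sketch of what by.column changes, assuming the zoo package is installed: the toy matrix `m` and the names `per_column` and `per_window` are made up for illustration, and `cor` stands in for any function of two columns.

```r
library(zoo)  # rollapply comes from the zoo package

# toy 6 x 2 matrix, made up for illustration
m <- cbind(a = 1:6, b = (1:6)^2)

# by.column = TRUE: FUN sees one column of the window at a time
per_column <- rollapply(m, 4, mean, by.column = TRUE)

# by.column = FALSE: FUN sees the whole 4 x 2 window as a matrix,
# so it can combine columns, e.g. correlate column 1 with column 2
per_window <- rollapply(m, 4, function(w) cor(w[, 1], w[, 2]),
                        by.column = FALSE)

dim(per_column)     # one row per window, one column per input column
length(per_window)  # one value per window
```

With 6 rows and a window of 4 there are 3 complete windows, so both results cover 3 windows when no fill is requested.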

For the output, the ccf results should be stored by row, with dimensions 10x9: there are 10 rows in the original data and 9 permutations of the ccf function. The first 3 rows of the output will be NA since the rolling window has size 4. The values below are for illustrative purposes only and are not the real output values.

output ->

          xx  xy  xz  yx  yy  yz  zx  zy  zz
     [1,] NA  NA  NA  NA  NA  NA  NA  NA  NA
     [2,] NA  NA  NA  NA  NA  NA  NA  NA  NA
     [3,] NA  NA  NA  NA  NA  NA  NA  NA  NA
     [4,] .1  .2  .3  .2  .8  .5  .3  .5  .9
     [5,] .1  .2  .3  .2  .8  .5  .3  .5  .9
     [6,] .1  .2  .3  .2  .8  .5  .3  .5  .9
     [7,] .1  .2  .3  .2  .8  .5  .3  .5  .9
     [8,] .1  .2  .3  .2  .8  .5  .3  .5  .9
     [9,] .1  .2  .3  .2  .8  .5  .3  .5  .9
    [10,] .1  .2  .3  .2  .8  .5  .3  .5  .9

I would appreciate a little help with looping across all permutations. I think if I got that working, I could then use rollapply as a wrapper to implement the sliding window.

TheGoat
  • Did you mean `cor` or `cov` rather than `ccf`? The ccf of two 4-vectors is a 7-vector so the output of all 9 possibilities is 7 * 9 = 63 long. – G. Grothendieck Aug 09 '17 at 01:00
  • @G.Grothendieck, thanks for your reply. I used ccf because it's a simple function that takes two input variables; any two-input function could be used. The main point is to take a matrix/data frame and apply a rolling function across all column permutations. I asked this question [here](https://stackoverflow.com/questions/45533719/wavelet-correlation-using-a-rolling-window), where I gave you a shout-out but got no replies, so I simplified the question to attract an answer. I guess mentioning wavelets scared some people off, but the idea is still the same. – TheGoat Aug 09 '17 at 12:08

3 Answers

1) ccf(x, y) of two 4-vectors x and y gives a 7-vector, so the output would have 3 * 3 * 7 = 63 columns, not the 9 stated in the question.
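
A quick check of the 7-vector claim, as a sketch on the first window of the question's data (the names `r`, `n_lags` and `lags` are only illustrative; `plot = FALSE` suppresses the default plot):

```r
set.seed(101)
inputx <- rnorm(1000, mean = 3, sd = 2)
inputy <- rnorm(1000, mean = 2, sd = 1)

# cross-correlation of the first window of size 4, without plotting
r <- ccf(inputx[1:4], inputy[1:4], plot = FALSE)

n_lags <- length(r$acf)   # number of correlations returned
lags <- as.vector(r$lag)  # the lags they correspond to
n_lags
lags
```

With the default lag.max, two series of length 4 are evaluated at lags -3 to 3, which is where the 7 values per pair come from.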

In a comment the poster stated that another function could be substituted for ccf, so below we assume cov(x, y): it returns a scalar rather than a 7-vector and so gives a 10 x 9 output. In this particular case cov(cbind(x, y, z)) produces a 3 x 3 matrix which, when flattened, gives a 9-vector.

    library(zoo)
    rollapplyr(head(example, 10), 4, function(x) c(cov(x)), fill = NA, by.column = FALSE)

giving the following 10x9 matrix:

           [,1]        [,2]       [,3]        [,4]      [,5]       [,6]       [,7]       [,8]      [,9]
 [1,]        NA          NA         NA          NA        NA         NA         NA         NA        NA
 [2,]        NA          NA         NA          NA        NA         NA         NA         NA        NA
 [3,]        NA          NA         NA          NA        NA         NA         NA         NA        NA
 [4,] 1.1990739 -0.72070179 -0.3951435 -0.72070179 1.8590569  3.1565993 -0.3951435  3.1565993  6.718376
 [5,] 1.1503463 -0.51712419  0.1548365 -0.51712419 1.6830055  2.6102211  0.1548365  2.6102211  5.058550
 [6,] 2.2854029 -0.12857123  1.1658204 -0.12857123 0.3413027  0.7821381  1.1658204  0.7821381  2.753662
 [7,] 0.7473036 -0.31336885 -0.2743693 -0.31336885 0.3923239  1.1959920 -0.2743693  1.1959920  6.109035
 [8,] 1.1727627 -0.53344663  2.2960862 -0.53344663 0.4851109 -0.5067012  2.2960862 -0.5067012 15.027672
 [9,] 1.2381071 -0.88053417  1.5728089 -0.88053417 0.9289009  0.7283704  1.5728089  0.7283704 18.179175
[10,] 1.2353345 -0.05021654  1.7008923 -0.05021654 1.6116281  1.4902571  1.7008923  1.4902571 18.399713

2) Alternatively, this gives the same result:

    k <- ncol(example)
    g <- expand.grid(1:k, 1:k)  # all k^2 ordered pairs of column indices
    # covariance for every column pair within one window x
    Cov <- function(x) apply(g, 1, function(ix) cov(x[, ix[1]], x[, ix[2]]))
    rollapplyr(head(example, 10), 4, Cov, by.column = FALSE, fill = NA)

Note that cov produces a symmetric matrix (the 3 lower triangular values equal the 3 upper triangular values), so we might wish to output only the diagonal and upper triangular part. In that case we could use upper.tri to subset it.
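
A minimal sketch of that upper.tri subsetting on a single window, using base R only. The helper name `cov_upper` and the comma-separated labels are hypothetical and not part of the code above.

```r
set.seed(101)
inputx <- rnorm(1000, mean = 3, sd = 2)
inputy <- rnorm(1000, mean = 2, sd = 1)
inputz <- rnorm(1000, mean = 1, sd = 3)
example <- cbind(inputx, inputy, inputz)

# hypothetical helper: covariance matrix of one window, reduced to
# its diagonal and upper triangle (6 values for 3 columns)
cov_upper <- function(w) {
  v <- cov(w)
  keep <- upper.tri(v, diag = TRUE)
  out <- v[keep]
  # label each value with the pair of columns it belongs to
  names(out) <- paste(rownames(v)[row(v)[keep]],
                      colnames(v)[col(v)[keep]], sep = ",")
  out
}

first_window <- cov_upper(example[1:4, ])
first_window
```

A function like this could be passed to rollapplyr in place of `function(x) c(cov(x))`, which should give 6 columns per row instead of 9.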

G. Grothendieck

Use expand.grid to generate all ordered pairs of column names, one pair per row. Then you can do something like:

    combos <- expand.grid(colnames(example), colnames(example), stringsAsFactors = FALSE)

    sapply(seq_len(nrow(combos)), function(j) {
        # combos[j, 1] and combos[j, 2] name the two columns of pair j
        MY_CALCULATION
    })
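
One way the placeholder might be filled in, as a sketch in base R rather than a definitive implementation. It assumes `cor` as the two-input function (the question's comments say any such function could be used), and the names `pair_stat`, `width` and `out` are made up for illustration.

```r
set.seed(101)
inputx <- rnorm(1000, mean = 3, sd = 2)
inputy <- rnorm(1000, mean = 2, sd = 1)
inputz <- rnorm(1000, mean = 1, sd = 3)
example <- head(cbind(inputx, inputy, inputz), 10)

cols <- colnames(example)
# expand.grid varies its first argument fastest, so naming the
# second member first yields the order xx, xy, xz, yx, ...
combos <- expand.grid(second = cols, first = cols,
                      stringsAsFactors = FALSE)

# apply a two-input function f to every column pair of one window w
pair_stat <- function(w, f = cor) {
  sapply(seq_len(nrow(combos)), function(j) {
    f(w[, combos$first[j]], w[, combos$second[j]])
  })
}

width <- 4
out <- matrix(NA_real_, nrow(example), nrow(combos),
              dimnames = list(NULL, paste(combos$first, combos$second,
                                          sep = ",")))
# row i holds the results for the window ending at row i
for (i in width:nrow(example)) {
  out[i, ] <- pair_stat(example[(i - width + 1):i, , drop = FALSE])
}
out
```

The explicit loop is used here only to keep the sketch free of package dependencies; `pair_stat` could equally be handed to a rolling helper that passes whole windows.
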
thc

I can't really tell what you are trying to achieve. rollapply is not a base R function (it comes from the zoo package), but in base R you can use combn. Below I produce all the windows of size 4 for the example above, using only 10 rows for illustration; you can change the code to handle the full 1000 rows.

    set.seed(101)
    inputx <- rnorm(10, mean = 3, sd = 2)
    inputy <- rnorm(10, mean = 2, sd = 1)
    inputz <- rnorm(10, mean = 1, sd = 3)
    example <- cbind(inputx, inputy, inputz)  # only 10 rows, just for illustration

    # row indices of every window of size 4: 1:4, 2:5, ..., 7:10
    a <- lapply(1:(nrow(example) - 3), function(x) x:(x + 3))
    # for each window, extract the rows for every pair of distinct columns
    Map(combn, ncol(example), 2, c(function(x, y) example[y, x]), y = a)

The code above returns, for each window, the rows for every pair of distinct columns (combn yields the 3 unordered pairs, not all 9 ordered pairs). You can then manipulate the result as you wish. If this helps, please let us know. Thank you.

Onyambu