I am quite familiar with R as I've been using it for a few years now. Unfortunately, I am not very well versed in creating functions that involve looping or repetition of an equation. The problem goes as follows:

I have a vector containing over 1000 values. I would like to calculate the absolute difference between the means of two adjacent, equal-sized groups of values taken from that vector.

Here is an example.

I have the vector (vec) of length 8:

 [1]  0.12472963  1.15341289 -1.09662288 -0.73241639  0.06437658 -0.13647136 -1.52592048  1.46450084  

I would like to calculate the mean of the first 2 values (0.12472963, 1.15341289) and obtain its absolute difference from the mean of the following 2 values (-1.09662288, -0.73241639), then repeat this while working my way down the vector.

In this case, I can easily use the following expression:

abs(mean(vec[1:2])-mean(vec[3:4]))

and increase each index by 1 so as to work my way down manually until the end of the vector. I would obtain the following vector:

[1]  1.553591  0.3624149  0.8784722  0.497176  0.005337574

What I would like, however, is an automated routine that enables me to do this over long vectors and to change the number of values from which the means are calculated.

It appears to me that it should be relatively simple, but I do not know where to start.

hrbrmstr

3 Answers

Use filter:

c(abs(filter(vec, c(0.5, 0.5, -0.5, -0.5), sides=1)[-(1:3)]))
#[1] 1.55359090 0.36241491 0.87847224 0.49717601 0.00533757
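
To change the number of values in each mean, the filter weights can be built from the window size. The following is only a minimal sketch under that assumption: `adjacent_mean_diff` and its argument `n` are hypothetical names, not part of the answer above, and `stats::filter` is called explicitly so that a `filter` from another attached package (for example dplyr) does not mask it.

```r
# Hypothetical helper: absolute difference between the means of two
# adjacent windows of n values each, sliding down vec one element at a time.
adjacent_mean_diff <- function(vec, n = 2) {
  stopifnot(n >= 1, length(vec) >= 2 * n)
  # With sides = 1 the filter computes
  #   y[i] = w[1] * x[i] + w[2] * x[i - 1] + ... + w[2n] * x[i - 2n + 1],
  # i.e. mean(later window) - mean(earlier window); abs() removes the sign.
  w <- c(rep(1 / n, n), rep(-1 / n, n))
  out <- stats::filter(vec, w, sides = 1)
  # The first 2n - 1 positions have no complete window and are NA.
  abs(as.numeric(out)[-seq_len(2 * n - 1)])
}

vec <- c(0.12472963, 1.15341289, -1.09662288, -0.73241639,
         0.06437658, -0.13647136, -1.52592048, 1.46450084)
adjacent_mean_diff(vec, n = 2)
```

With `n = 2` this reduces to the weights `c(0.5, 0.5, -0.5, -0.5)` used above; with `n = 1` it is the same as `abs(diff(vec))`.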
Roland

Using rollapply from zoo

 library(zoo)
 n <- 2
 n1 <- length(vec)

 # use n as the window width so the group size can be changed in one place
 abs(rollapply(vec[1:(n1-n)], n, mean) - rollapply(vec[(n+1):n1], n, mean))
 #[1] 1.55359090 0.36241491 0.87847224 0.49717601 0.00533757

Other variations of the above code, suggested in a comment by @G. Grothendieck (one of the authors of the zoo package), are:

  # using `rollmean` instead of `rollapply`
  abs(rollmean(vec[1:(n1-n)], 2) - rollmean(vec[(n+1):n1], 2))

or

  rollapply(vec, 4, function(x) abs(mean(x[1:2]) - mean(x[3:4])))

or

  abs(rollapply(vec, 4, "%*%", c(1, 1, -1, -1)/2))
akrun
  • These variations also work: `abs(rollmean(vec[1:(n1-n)], 2) - rollmean(vec[(n+1):n1], 2))` and `rollapply(vec, 4, function(x) abs(mean(x[1:2]) - mean(x[3:4])))` and `abs(rollapply(vec, 4, "%*%", c(1, 1, -1, -1)/2))`. – G. Grothendieck Oct 22 '14 at 23:06
  • @G. Grothendieck Thanks for the comment and providing the variations. – akrun Oct 23 '14 at 03:02

As always, I chime in with:

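library(microbenchmark)
# roland() and akrun() are assumed to be functions wrapping the two answers above
vec <- rep(c(0.12472963, 1.15341289, -1.09662288, -0.73241639,
             0.06437658, -0.13647136, -1.52592048, 1.46450084), 100)

microbenchmark(roland(vec), akrun(vec), times = 3)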

Unit: microseconds
        expr      min        lq      mean   median       uq      max neval
 roland(vec)  564.128  565.2275  647.3353  566.327  688.939  811.551     3
  akrun(vec) 3717.410 3982.1535 4218.3057 4246.897 4468.753 4690.610     3
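
The definitions of `roland()` and `akrun()` are not shown in this answer. To rerun the comparison, one plausible setup is to wrap the code from the two answers as functions. This is a sketch under that assumption, not necessarily the exact code that produced the timings above; the benchmark only runs if the zoo and microbenchmark packages are installed.

```r
# Assumed wrappers: each one simply wraps the code from the corresponding
# answer; the original benchmark may have used different definitions.
roland <- function(vec) {
  c(abs(stats::filter(vec, c(0.5, 0.5, -0.5, -0.5), sides = 1)[-(1:3)]))
}

akrun <- function(vec, n = 2) {
  n1 <- length(vec)
  abs(zoo::rollapply(vec[1:(n1 - n)], n, mean) -
        zoo::rollapply(vec[(n + 1):n1], n, mean))
}

vec <- rep(c(0.12472963, 1.15341289, -1.09662288, -0.73241639,
             0.06437658, -0.13647136, -1.52592048, 1.46450084), 100)

# Run the comparison only when both optional packages are available.
if (requireNamespace("zoo", quietly = TRUE) &&
    requireNamespace("microbenchmark", quietly = TRUE)) {
  # Check that both approaches agree before timing them.
  stopifnot(isTRUE(all.equal(roland(vec), akrun(vec))))
  print(microbenchmark::microbenchmark(roland(vec), akrun(vec), times = 3))
}
```

Timings will differ by machine and package version, so the numbers above should be read as a relative comparison only.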
Carl Witthoft
  • I've updated my answer with a slight change that should improve performance and ensures correct handling of `NA` values. – Roland Oct 23 '14 at 06:56