Averaging values between paired columns across a large data frame

Question

I have a dataframe consisting of a series of paired columns. Here is a small example.

df1 <- as.data.frame(matrix(sample(0:1000, 36*10, replace=TRUE), ncol=1))
df2 <- as.data.frame(rep(1:12, each=30))
df3 <- as.data.frame(matrix(sample(0:500, 36*10, replace=TRUE), ncol=1))
df4 <- as.data.frame(c(rep(5:12, each=30),rep(1:4, each=30)))
df5 <- as.data.frame(matrix(sample(0:200, 36*10, replace=TRUE), ncol=1))
df6 <- as.data.frame(c(rep(8:12, each=30),rep(1:7, each=30)))
Example <- cbind(df1,df2,df3,df4,df5,df6)

What I would like to do is find the average of each odd-numbered column (df1, df3, df5), grouped by the values in the adjacent column. In the example, that would give three sets of averages, each with one average per value between 1 and 12. I have managed to apply a function to a specific pair of columns:

Example_two <- cbind(df1,df2)
colnames(Example_two) <- c("x", "y")
tapply(Example_two$x, Example_two$y, mean)

However, the dataframe I will be working with is considerably larger, so some form of apply function would be ideal for doing this iteratively across each pair of columns. I have found a similar question, "Is there a R function that applies a function to each pair of columns?", but I can't seem to apply its answers to my own dataset.

Any help would be much appreciated, thank you in advance.

Do you need to get the average value (summary) as a separate dataset or as columns in Example? — akrun, May 18 '15 at 11:18

score 2 · Accepted Answer · answered May 18 '15 at 11:23

Try

mapply(function(x, y) tapply(x, y, FUN = mean),
       Example[seq(1, ncol(Example), 2)],
       Example[seq(2, ncol(Example), 2)])

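Or, instead of seq(1, ncol(Example), 2) and seq(2, ncol(Example), 2), index with the logical vectors c(TRUE, FALSE) for the first argument and c(FALSE, TRUE) for the second. R recycles them across the columns, so they select the odd-numbered and even-numbered columns respectively.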
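As a rough check that the two indexing styles agree, here is a minimal sketch. It rebuilds the example data with explicit column names (v1/g1, v2/g2, v3/g3) and a fixed seed; both are assumptions added for illustration and are not part of the original question.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
n <- 360

# Same layout as the question: value column, then its grouping column.
# The column names are hypothetical; the original columns were unnamed.
Example <- data.frame(
  v1 = sample(0:1000, n, replace = TRUE),
  g1 = rep(1:12, each = 30),
  v2 = sample(0:500, n, replace = TRUE),
  g2 = c(rep(5:12, each = 30), rep(1:4, each = 30)),
  v3 = sample(0:200, n, replace = TRUE),
  g3 = c(rep(8:12, each = 30), rep(1:7, each = 30))
)

# Odd columns are the values, even columns are the groups (seq indexing).
by_seq <- mapply(function(x, y) tapply(x, y, FUN = mean),
                 Example[seq(1, ncol(Example), 2)],
                 Example[seq(2, ncol(Example), 2)])

# Same selection using recycled logical vectors.
by_lgl <- mapply(function(x, y) tapply(x, y, FUN = mean),
                 Example[c(TRUE, FALSE)],
                 Example[c(FALSE, TRUE)])

identical(by_seq, by_lgl)  # do both selections give the same result?
dim(by_lgl)                # rows = group values, columns = value columns
```

Because tapply orders its result by group value, each row of the resulting matrix refers to the same group across all pairs, even though the grouping columns start at different values.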

akrun

@JamesWhite Glad to know that it works. This could be done in a couple of ways, but I thought `mapply` would be easier – akrun May 18 '15 at 11:33
