0

I'm relatively new to R and am having trouble processing my data into a more workable form. I have continuous x and y vectors in which some y values have multiple x values. How would I go about writing a script that automatically averages those multiple x values and creates a new data set in which the averaged x values and the y values have the same length? An example is included below.

X <- c(34.2, 35.3, 32.1, 33.0, 34.7, 34.2, 34.1, 34.0, 34.1)
Y <- c(90.1, 90.1, 72.5, 63.1, 45.1, 22.2, 22.2, 22.2,  5.6)
Brian Tompsett - 汤莱恩

3 Answers

1

I think this does what you want. The aggregate function groups x by y in this case and takes the mean of x within each group.

x <- c(34.2, 35.3, 32.1, 33.0, 34.7, 34.2, 34.1, 34.0, 34.1)
y <- c(90.1, 90.1, 72.5, 63.1, 45.1, 22.2, 22.2, 22.2, 5.6)
df <- data.frame(x = x, y = y)

# mean of x for each distinct value of y
df2 <- aggregate(x ~ y, data = df, FUN = mean)
df2
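
To make the direction of the formula concrete, here is a minimal sketch using the data from the question. In aggregate's formula interface, the variable on the left of ~ is the one being averaged and the variable on the right defines the groups. The names x_by_y and y_by_x are only illustrative.

```r
# Same data as in the question
x <- c(34.2, 35.3, 32.1, 33.0, 34.7, 34.2, 34.1, 34.0, 34.1)
y <- c(90.1, 90.1, 72.5, 63.1, 45.1, 22.2, 22.2, 22.2, 5.6)
df <- data.frame(x = x, y = y)

# Left of ~ : variable to average; right of ~ : grouping variable
x_by_y <- aggregate(x ~ y, data = df, FUN = mean)  # mean x for each distinct y
y_by_x <- aggregate(y ~ x, data = df, FUN = mean)  # mean y for each distinct x

nrow(x_by_y)  # 6, one row per distinct y
nrow(y_by_x)  # 7, one row per distinct x
```

Since the question asks for one averaged x per y value, x ~ y looks like the direction that matches it.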
Jason
  • Thank you for the help. That is what I wanted. There's always several ways to do things in R. – Trevor Eakes Jan 30 '15 at 22:03
  • You're welcome Trevor. That's what we are here for. If you have a sec, you might give out the check mark somewhere to close the ticket so to speak. – Jason Jan 31 '15 at 00:04
1

I assume you want the average X for each Y value.

Try this:

X <- c(34.2, 35.3, 32.1, 33.0, 34.7, 34.2, 34.1, 34.0, 34.1)
Y <- c(90.1, 90.1, 72.5, 63.1, 45.1, 22.2, 22.2, 22.2,  5.6)
xy <- data.frame(X, Y)
tapply(X = xy$X, INDEX = list(xy$Y), FUN = mean)
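
Note that tapply returns a named array rather than a data frame: the names are the distinct Y values (stored as strings) and the values are the group means. If a new data set with two columns of equal length is needed, one possible sketch is below. The names means and result are just illustrative, and converting the names back with as.numeric assumes the Y values survive a round trip through their printed form, which holds for the values here but might not for values with many significant digits.

```r
X <- c(34.2, 35.3, 32.1, 33.0, 34.7, 34.2, 34.1, 34.0, 34.1)
Y <- c(90.1, 90.1, 72.5, 63.1, 45.1, 22.2, 22.2, 22.2, 5.6)

# Named array: one mean of X per distinct Y, sorted by Y
means <- tapply(X, Y, mean)

# Rebuild a data frame with two columns of equal length
result <- data.frame(Y = as.numeric(names(means)),
                     X = as.vector(means))
result
```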
Skiptoniam
0

If I understand you correctly, you want a new dataset in which every distinct Y value is paired with the average of its corresponding X values. Y values that occur only once need no special handling, because the mean of a single value is just that value. This can be done easily with dplyr.

X <- c(34.2, 35.3, 32.1, 33.0, 34.7, 34.2, 34.1, 34.0, 34.1)
Y <- c(90.1, 90.1, 72.5, 63.1, 45.1, 22.2, 22.2, 22.2,  5.6)
Df <- data.frame(X, Y)
> Df
     X    Y
1 34.2 90.1
2 35.3 90.1
3 32.1 72.5
4 33.0 63.1
5 34.7 45.1
6 34.2 22.2
7 34.1 22.2
8 34.0 22.2
9 34.1  5.6

Now:

library(dplyr)
Df2 <- Df %>% group_by(Y) %>% summarize(X = mean(X))
> Df2
Source: local data frame [6 x 2]

     Y     X
1  5.6 34.10
2 22.2 34.10
3 45.1 34.70
4 63.1 33.00
5 72.5 32.10
6 90.1 34.75
Avraham