4

I am trying to build a data frame containing the per-group maximum of one column. I would like a data frame with 4 rows (one for each value of G) holding the maximum of X in that group and the corresponding Y value. I know I could write a loop, but I would rather not.

Data<-data.frame(X=rnorm(200), Y=rnorm(200), G=rep(c(1,2,3,4), each=50))
XMax<-tapply(Data$X, Data$G, function(x){max(x, na.rm=TRUE)})
WhichXMax<-tapply(Data$X, Data$G, function(x){which.max(x)})

The which.max function returns the position within each group's subset (after tapply has split the data by the factor), whereas I really want the row number in Data. If I had that, I could do something like:

YMax<-Data$Y[WhichXMax]
MaxData<-data.frame(XMax=XMax, YMax=YMax, G=names(XMax))
Machavity
LoveMeow

3 Answers

7
library(dplyr)
Data %>% 
    group_by(G) %>% 
    filter(X==max(X))

The filter above keeps every row tied for the group maximum. If you want exactly one row per group, then

Data %>%
    group_by(G) %>%
    arrange(desc(X)) %>%
    slice(1)
Gregor Thomas
ExperimenteR
  • I have tried this code on my 'real data' and it gives me 6 more rows than length(levels(Data$G)). Any ideas? It looks like it reports both rows if there is a tie, whereas the accepted answer just chooses one row for the tie. Also, could you please explain the %>% operator? I have not seen that one before! :) – LoveMeow May 19 '15 at 07:07
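On the tie question raised in the comment: comparing against the group maximum keeps every row equal to it, while which.max() returns only the first position of the maximum. The following is a minimal base R sketch of that difference, not the answerer's code; `Toy`, `AllTies`, and `FirstOnly` are hypothetical names, and the values are made up only to force a tie. As for the operator, `%>%` is the pipe that dplyr re-exports from magrittr: `x %>% f(y)` is evaluated as `f(x, y)`.

```r
# Hypothetical toy data: group 1 has a tie for the maximum of X.
Toy <- data.frame(X = c(1, 3, 3, 2, 5),
                  Y = c(10, 20, 30, 40, 50),
                  G = c(1, 1, 1, 2, 2))

# ave() returns the group maximum repeated for each row, so this comparison
# keeps every tied row (the base R analogue of filter(X == max(X)) by group).
AllTies <- Toy[Toy$X == ave(Toy$X, Toy$G, FUN = max), ]

# which.max() returns only the first position of the maximum,
# so each group contributes exactly one row.
FirstOnly <- do.call(rbind, lapply(split(Toy, Toy$G),
                                   function(d) d[which.max(d$X), ]))

nrow(AllTies)    # 3: both tied rows of group 1, plus one row for group 2
nrow(FirstOnly)  # 2: one row per group
```

This would explain extra rows on real data if several records share the maximum within a group.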
6

You can use by and reference the rownames of the row returned by which.max:

Data[by(Data, Data$G, function(dat) rownames(dat)[which.max(dat$X)] ),]

#           X          Y G
#4   1.595281 -0.3309078 1
#61  2.401618  0.9510128 2
#147 2.087167  0.9160193 3
#171 2.307978 -0.3887222 4

(This assumes set.seed(1) was called before creating Data, for reproducibility's sake.)
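If you would rather stay close to the tapply attempt in the question, one possible variation (a sketch, not part of the answer above) is to pass row numbers to tapply instead of X, so that the value returned for each group is already a row number of Data. `RowOfMax` is a hypothetical name introduced here for illustration.

```r
set.seed(1)  # same seed the answer assumes
Data <- data.frame(X = rnorm(200), Y = rnorm(200), G = rep(c(1, 2, 3, 4), each = 50))

# Split the row numbers of Data by group, then pick, within each group,
# the row number whose X is largest (first one in case of ties).
RowOfMax <- tapply(seq_len(nrow(Data)), Data$G,
                   function(i) i[which.max(Data$X[i])])

# Use the global row numbers to pull X and the corresponding Y.
MaxData <- data.frame(G = names(RowOfMax),
                      XMax = Data$X[RowOfMax],
                      YMax = Data$Y[RowOfMax])
MaxData
```

With the same seed, this should select the same rows as the by() call above.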

thelatemail
5
library(data.table)
set.seed(1)
Data<-data.frame(X=rnorm(200), Y=rnorm(200), G=rep(c(1,2,3,4), each=50))
setDT(Data)[,list(X=max(X),Y=Y[which.max(X)]),by=G]

#   G        X          Y
#1: 1 1.595281 -0.3309078
#2: 2 2.401618  0.9510128
#3: 3 2.087167  0.9160193
#4: 4 2.307978 -0.3887222
user227710