I'm looking to filter a data.table down to the rows that hold the maximum value of V3 within each group of V2.

library(data.table)
DT <- data.table(V1 = c(1L, 2L),
                 V2 = LETTERS[1:3],
                 V3 = round(rnorm(4), 4),
                 V4 = 1:12)

    V1 V2      V3 V4
 1:  1  A -0.1346  1
 2:  2  A  0.2309  4
 3:  1  A  0.7067  7
 4:  2  A -1.0082 10
 5:  2  B -1.0082  2
 6:  1  B -0.1346  5
 7:  2  B  0.2309  8
 8:  1  B  0.7067 11
 9:  1  C  0.7067  3
10:  2  C -1.0082  6
11:  1  C -0.1346  9
12:  2  C  0.2309 12

I've tried this but no dice:

DT[, max(V3), by = .(V2)]

   V2     V1
1:  A 1.2281
2:  B 1.2281
3:  C 1.2281

Short of a loop, how would I approach this? I prefer a data.table method.

Todd Shannon
  • What's the issue? Everything seems to be working as intended. When you create your data table, it will have size 12 (because of V4), but you have 4 random values and 3 letters, therefore the values will repeat themselves, making it so the maximum for each letter will always be the same. – Luis Jun 06 '18 at 19:57
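The recycling described in this comment can be checked with a small sketch. It uses the four V3 values from the table printed in the question instead of `rnorm()`, and spells out the recycling with `rep_len()`; the names `vals` and `counts` are made up for illustration.

```r
library(data.table)

# The four V3 values from the printed table, fixed so the result is reproducible
vals <- c(-0.1346, 0.2309, 0.7067, -1.0082)

# rep_len() makes explicit the recycling that data.table() does implicitly
DT <- data.table(V2 = rep_len(LETTERS[1:3], 12),
                 V3 = rep_len(vals, 12),
                 V4 = 1:12)

# Each group ends up holding all four values, so every group has the same max
counts <- DT[, .(n_distinct = uniqueN(V3), group_max = max(V3)), by = V2]
print(counts)
```

Because 3 letters and 4 values are both recycled over 12 rows, each letter is paired with every value exactly once.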

1 Answer

We can build a row index with `.I` (the row numbers of the per-group maxima) and use it to subset the data.table:

DT[DT[, .I[V3 == max(V3)], by = V2]$V1]
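To see what the two steps do, here is a minimal sketch on a small made-up table (the data and the names `idx` and `res` are assumptions for illustration, not from the question). Group "A" has a tied maximum on purpose.

```r
library(data.table)

# Small made-up table; group "A" has two rows sharing the maximum
DT <- data.table(V2 = c("A", "A", "A", "B", "B"),
                 V3 = c(1.5, 0.2, 1.5, -0.3, 0.9),
                 V4 = 1:5)

# Inner call: row numbers (.I) of the per-group maxima.
# The unnamed j result is auto-named V1, which is why $V1 is used.
idx <- DT[, .I[V3 == max(V3)], by = V2]
print(idx)

# Outer call: subset the original table by those row numbers
res <- DT[idx$V1]
print(res)
```

With `V3 == max(V3)`, every row that ties for the group maximum is kept.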

If there is only a single max element for each 'V2' (or only the first one is wanted), `which.max` can be used instead:

DT[DT[, .I[which.max(V3)], by = V2]$V1]
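A short sketch of how this variant behaves with ties, again on a made-up table (the data and the names `first_max` and `first_max_sd` are assumptions). The `.SD` form is shown only as an alternative spelling that should give the same rows; it can be slower when there are many groups.

```r
library(data.table)

# Small made-up table; group "A" has two rows sharing the maximum
DT <- data.table(V2 = c("A", "A", "A", "B", "B"),
                 V3 = c(1.5, 0.2, 1.5, -0.3, 0.9),
                 V4 = 1:5)

# which.max() returns the position of the first maximum only,
# so the tied row 3 in group "A" is not returned
first_max <- DT[DT[, .I[which.max(V3)], by = V2]$V1]
print(first_max)

# Alternative spelling with .SD; the grouping column V2 comes first in the result
first_max_sd <- DT[, .SD[which.max(V3)], by = V2]
print(first_max_sd)
```

If tied rows should all be returned, the `V3 == max(V3)` form is the one to use.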
akrun
  • I think there is a typo in the second line of code. It should be `DT[DT[, .I[which.max(V3)], by = V2]$V1]`, right? – otwtm May 12 '20 at 07:55