I have a dataset like this:

data <- data.frame(Time = c(1,4,6,9,11,13,16, 25, 32, 65),
                  A = c(10, NA, 13, 2, 32, 19, 32, 34, 93, 12),
                  B = c(1, 99, 32, 31, 12, 13, NA, 13, NA, NA),
                  C = c(2, 32, NA, NA, NA, NA, NA, NA, NA, NA))

What I want to retrieve are the values in Time that correspond to the last non-NA value in each of A, B, and C. For example, the last non-NA values in A, B, and C are 12, 13, and 32, respectively.

So, the Time values that correspond are 65, 25, and 4.

I've tried something like `data[which(data$Time == max(data$A)), ]`, but this doesn't work.

Drew

1 Answer

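We can multiply the logical matrix of non-NA positions by the row index, so that each non-NA cell holds its row number and each NA cell holds 0. Taking the column maxima with colMaxs (from matrixStats) then gives the row of the last non-NA value in each column, which we use to subset the 'Time' column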

library(matrixStats)
data$Time[colMaxs((!is.na(data[-1])) * row(data[-1]))]
#[1] 65 25  4
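
To see what the multiplication does, the same idea can be unpacked step by step in base R. This is only a sketch: it swaps colMaxs for `apply(..., 2, max)`, and the intermediate names (`vals`, `not_na`, `idx`, `last_row`) are made up for illustration.

```r
data <- data.frame(Time = c(1, 4, 6, 9, 11, 13, 16, 25, 32, 65),
                   A = c(10, NA, 13, 2, 32, 19, 32, 34, 93, 12),
                   B = c(1, 99, 32, 31, 12, 13, NA, 13, NA, NA),
                   C = c(2, 32, NA, NA, NA, NA, NA, NA, NA, NA))

vals   <- as.matrix(data[-1])     # numeric matrix holding columns A, B, C
not_na <- !is.na(vals)            # TRUE where a value is present
idx    <- not_na * row(vals)      # row number where present, 0 where NA

# The largest row number in each column is the row of its last non-NA value
last_row <- apply(idx, 2, max)
last_row                          # 10, 8, 2 for A, B, C

data$Time[last_row]
#[1] 65 25  4
```

One caveat of this approach: a column that is entirely NA would get a maximum of 0, and indexing 'Time' with 0 silently drops that column from the result.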

Or, using base R, we get the row/column indices of the non-NA cells with which(..., arr.ind = TRUE), take the max row index per column with a group-by operation (tapply), and use that to extract the 'Time' value

m1 <- which(!is.na(data[-1]), arr.ind = TRUE)
data$Time[tapply(m1[,1], m1[,2], FUN = max)]
#[1] 65 25  4
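
The intermediate objects in the base R version can be inspected as follows. This is a walk-through sketch of the same two lines; the name `last_row` is introduced here only for illustration.

```r
data <- data.frame(Time = c(1, 4, 6, 9, 11, 13, 16, 25, 32, 65),
                   A = c(10, NA, 13, 2, 32, 19, 32, 34, 93, 12),
                   B = c(1, 99, 32, 31, 12, 13, NA, 13, NA, NA),
                   C = c(2, 32, NA, NA, NA, NA, NA, NA, NA, NA))

# One line per non-NA cell: its row index and its column index
m1 <- which(!is.na(data[-1]), arr.ind = TRUE)
colnames(m1)                      # "row" "col"
nrow(m1)                          # 18 non-NA cells in total

# Group the row indices by column index and keep the largest per group
last_row <- tapply(m1[, "row"], m1[, "col"], FUN = max)
last_row                          # rows 10, 8, 2 for columns 1, 2, 3

data$Time[last_row]
#[1] 65 25  4
```

Unlike the multiplication approach, a column with no non-NA values never appears in `m1`, so it has no group in the tapply result at all.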

Or with summarise/across (dplyr 1.0.0 or later; the devel version at the time of writing)

library(dplyr)
data %>% 
    summarise(across(A:C, ~ tail(Time[!is.na(.)], 1)))
#    A  B C
#1 65 25 4

Or using summarise_at with the current version of dplyr

data %>%
     summarise_at(vars(A:C), ~ tail(Time[!is.na(.)], 1))
akrun
  • Thank you Akrun. Using the dot in `!is.na(.)` tells R to use all the variables in the data frame (A:C in this case), right? – Eric Jun 04 '20 at 20:45
  • @Mcmahoon89 The `~` is a shortcut for `function(x)`. With `across`, we are looping over each of the columns, similar to `lapply(data[-1], function(x) !is.na(x))`. In tidyverse terminology, the `~` replaces `function(x)`, and `x` is replaced with `.`, which is basically the vector of values of each column. – akrun Jun 04 '20 at 20:48
  • @Mcmahoon89 I read your other comment. It is basically practice and, of course, following certain simple things each day. – akrun Jun 04 '20 at 20:49
  • Hi Akrun, how do we get access to the across() function? It doesn't seem to be available in my dplyr package. – Drew Jun 04 '20 at 20:57
  • @Drew Sorry, I was using the devel version of dplyr. I thought that the new CRAN version 1.0.0 also included it. – akrun Jun 04 '20 at 20:58
  • @akrun Ah, I see. Your second (base R) solution worked for me, btw, so thank you. Is it possible for you to update the dplyr solution using the functions available in the new CRAN version? The dplyr one makes sense to me intuitively, and I would love to see a solution there that doesn't require across(). – Drew Jun 04 '20 at 21:03
  • @akrun Great. Thank you! – Drew Jun 04 '20 at 21:06
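
The point made in the comments about `~` and `.` can be sketched in base R by writing the formula `~ tail(Time[!is.na(.)], 1)` as an explicit function. The helper name `last_time` and its `time` argument are hypothetical, chosen only for this illustration; `sapply` stands in for the column-by-column looping that `across` and `summarise_at` do.

```r
data <- data.frame(Time = c(1, 4, 6, 9, 11, 13, 16, 25, 32, 65),
                   A = c(10, NA, 13, 2, 32, 19, 32, 34, 93, 12),
                   B = c(1, 99, 32, 31, 12, 13, NA, 13, NA, NA),
                   C = c(2, 32, NA, NA, NA, NA, NA, NA, NA, NA))

# Hypothetical helper: `x` plays the role of `.` in the formula,
# i.e. ONE column's vector of values, not the whole data frame
last_time <- function(x, time) tail(time[!is.na(x)], 1)

# Call the helper once per column (A, then B, then C)
res <- sapply(data[-1], last_time, time = data$Time)
res
#  A  B  C 
# 65 25  4 
```

So the dot does not mean "all the variables": the function is called once per selected column, and the dot is that single column each time. Note that for a column with no non-NA values the helper returns a zero-length vector, in which case `sapply` would return a list rather than a named vector.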