
I have a data frame as follows:

df <- as.data.frame(cbind(c(1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3),c(4,8,12,18,21,24,27,1,4,7,10,13,16,19,22,25,28,5,10,15,20,25), c(1.0,0.7,2.0,2.9,1.6,0.6,0.9,2,4,1,8,4,2,0.8,1.2,1.0,0.6,2,9,7,4,5)))
colnames(df) <- c("ID","time","value")

Each ID has more than one local minimum. For each ID, I want to determine whether the post-peak local minimum is greater or less than the pre-peak local minimum, extract the IDs for which the post-peak local minimum is less than the pre-peak local minimum, and capture the time of the rise. To do this, I want to create a column "index" that is 0 if the pre-peak local minimum is greater than the post-peak local minimum and 1 if the pre-peak local minimum is less than the post-peak local minimum. For IDs 1 and 2 the index will be 0, but for ID 3 it will be 1. I then want to capture the time of the peak. The resulting data frame would look something like this:

df1 <- cbind(c(1,2),c(18,10), c(0,0))
colnames(df1) <- c("ID","time","index")

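I could capture the time of the rise using this code:

library(dplyr)

# peak = position, within each ID, of the largest step-to-step increase in value
df1 <- df%>%group_by(ID)%>%mutate(peak = which.max(c(diff(value),TRUE)))
# keep only the row at that position (row_number() must be called as a function)
df1 <- df1%>%group_by(ID)%>%filter(row_number() == peak)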

However, I am not able to compute the "index" column from the comparison of the pre-peak vs. post-peak minima.

Please help me.

AnilGoyal
Biostats
  • Can you explain which are the post-rise and pre-rise local minima in your data, and how one goes about selecting the `time`? – Ronak Shah Mar 24 '21 at 10:31
  • @RonakShah The rise is the maximum value and the time is the time of the maximum value. For example, for ID 1, 2.9 is the maximum value (the peak) and 18 is the time of the rise. For ID 2, 8 is the maximum value (the peak) and the time is 10. But for ID 3, the pre-peak minimum is less than the post-peak minimum, so even though there is a peak, it does not meet the criterion, because the criterion is that the pre-peak minimum is greater than the post-peak minimum. – Biostats Mar 24 '21 at 11:38
  • Hi can anyone help me please??? – Biostats Mar 25 '21 at 01:17

1 Answer


Does this help?

library(dplyr)

df %>%
  group_by(ID) %>%
  slice(which.max(value)) %>%
  ungroup %>%
  mutate(index = as.integer(lead(time, default = Inf) > time)) %>%
  filter(index == 0)

#     ID  time value index
#  <dbl> <dbl> <dbl> <int>
#1     1    18   2.9     0
#2     2    10   8       0
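
Note that the index in the code above is derived from the peak times of consecutive IDs (`lead(time)` after `ungroup`), which happens to reproduce the expected result on this data. As a complement, here is a minimal base R sketch that compares the minima on each side of the peak directly. It rests on assumptions that are not stated in the question: the "pre-peak minimum" is taken to be the lowest value strictly before the maximum, the "post-peak minimum" the lowest value strictly after it, ties for the maximum resolve to the first occurrence, and the index is `NA` when the peak is the first or last observation. The names `peak_summary`, `res` and `res0` are hypothetical and used only for illustration.

```r
# Example data from the question, built as a data frame
df <- data.frame(
  ID    = c(rep(1, 7), rep(2, 10), rep(3, 5)),
  time  = c(4, 8, 12, 18, 21, 24, 27,
            1, 4, 7, 10, 13, 16, 19, 22, 25, 28,
            5, 10, 15, 20, 25),
  value = c(1.0, 0.7, 2.0, 2.9, 1.6, 0.6, 0.9,
            2, 4, 1, 8, 4, 2, 0.8, 1.2, 1.0, 0.6,
            2, 9, 7, 4, 5)
)

# Hypothetical helper: summarise the rows of a single ID
peak_summary <- function(d) {
  d <- d[order(d$time), ]            # make sure rows are in time order
  p <- which.max(d$value)            # position of the peak (first maximum)
  pre  <- d$value[seq_len(p - 1)]    # values strictly before the peak
  post <- d$value[-seq_len(p)]       # values strictly after the peak
  if (length(pre) == 0 || length(post) == 0) {
    index <- NA_integer_             # assumption: undefined without both sides
  } else {
    # 1 if the pre-peak minimum is less than the post-peak minimum, else 0
    index <- as.integer(min(pre) < min(post))
  }
  data.frame(ID = d$ID[1], time = d$time[p], index = index)
}

# Apply the helper to each ID and stack the one-row results
res <- do.call(rbind, lapply(split(df, df$ID), peak_summary))
rownames(res) <- NULL

# Keep only the IDs whose index is 0 (which() drops any NA rows)
res0 <- res[which(res$index == 0), ]
print(res0)
```

With this rule, `res` holds one row per ID (peak time and index), and `res0` keeps only the IDs whose pre-peak minimum is not less than the post-peak minimum. If a different definition of local minimum is intended, only the `pre` and `post` lines would need to change.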
Ronak Shah