My `ifelse` statement: `df$baseline <- ifelse(df$value <= -20, "yes", "")`. However, I only want a row labelled "yes" at the first occurrence of a value at or below -20 within each id. For example, id = 3 should only have a "yes" at value = -46.96.

Can this even be done with just an ifelse statement since it has to be grouped by id?

     id        value      yes
1     1           NA     <NA>
2     1       -27.17      yes
3     2           NA     <NA>
4     2       -18.69         
5     2        17.27         
6     2       -34.38      yes
7     3           NA     <NA>
8     3       134.50         
9     3       -46.96      yes
10    3        88.18         
11    3       -32.27      yes -> SHOULD BE ""
12    3        -0.40         
13    3        36.69         
phoxis
Theresa
    Whenever you use a phrase like *"for each id"*, probably you should look for a grouped solution, probably with `data.table` or `dplyr`. – Gregor Thomas Mar 13 '19 at 16:03

2 Answers

The NAs make things a little tricky, but here's a solution with dplyr:

library(dplyr)
df %>%
  group_by(id) %>%
  mutate(
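    # Flag every value at or below -20 (NA stays NA)
    baseline = ifelse(value <= -20, "yes", ""),
    # Within each id, blank out every "yes" after the first
    baseline = ifelse(baseline == "yes" & cumsum(baseline == "yes" & !is.na(baseline)) > 1, "", baseline)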
  )

# # A tibble: 13 x 3
# # Groups:   id [3]
#       id  value baseline
#    <int>  <dbl> <chr>   
#  1     1   NA   NA      
#  2     1  -27.2 yes     
#  3     2   NA   NA      
#  4     2  -18.7 ""      
#  5     2   17.3 ""      
#  6     2  -34.4 yes     
#  7     3   NA   NA      
#  8     3  134.  ""      
#  9     3  -47.0 yes     
# 10     3   88.2 ""      
# 11     3  -32.3 ""      
# 12     3   -0.4 ""      
# 13     3   36.7 ""  

Using this data:

df <- read.table(header = TRUE, text = "     id        value   
1     1           NA   
2     1       -27.17   
3     2           NA   
4     2       -18.69         
5     2        17.27         
6     2       -34.38   
7     3           NA   
8     3       134.50         
9     3       -46.96   
10    3        88.18         
11    3       -32.27   
12    3        -0.40         
13    3        36.69   ")
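
To see why the second `ifelse()` leaves only the first "yes", it may help to trace the running count for a single group. The following is a minimal base R sketch, not part of the original answer, using the id = 3 values from the data above; the names `hits` and `result` are hypothetical.

```r
# Values for id == 3, copied from the data above
value <- c(NA, 134.50, -46.96, 88.18, -32.27, -0.40, 36.69)

# Step 1: flag every value at or below -20 (NA stays NA)
baseline <- ifelse(value <= -20, "yes", "")

# Step 2: running count of "yes" flags; NA & FALSE is FALSE,
# so missing values add nothing to the count
hits <- cumsum(baseline == "yes" & !is.na(baseline))

# Step 3: blank out any "yes" that is not the first one
result <- ifelse(baseline == "yes" & hits > 1, "", baseline)

print(data.frame(value, baseline, hits, result))
```

The count reaches 2 at -32.27, so that row is blanked while the -46.96 row keeps its "yes".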
Gregor Thomas
Use `ave` to apply the function `yes`, defined below, within each id. The function first uses `ifelse` to build a vector that is NA wherever `value <= -20` is NA and "" everywhere else, then replaces the first position where `value <= -20` is TRUE with "yes".

yes <- function(x) replace(ifelse(is.na(x), NA, ""), which(x)[1], "yes")
transform(df, yes = ave(value <= -20, id, FUN = yes))

giving:

   id  value  yes
1   1     NA <NA>
2   1 -27.17  yes
3   2     NA <NA>
4   2 -18.69     
5   2  17.27     
6   2 -34.38  yes
7   3     NA <NA>
8   3 134.50     
9   3 -46.96  yes
10  3  88.18     
11  3 -32.27     
12  3  -0.40     
13  3  36.69     
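
For readers who would rather stay close to the original `ifelse` call, the same grouped logic can probably be written as a per-id running count combined with `ifelse`. This is a sketch under assumptions rather than part of either answer: it rebuilds `df` with `data.frame()` so the block is self-contained, and the helper names `hit` and `nth` are hypothetical.

```r
# Same data as in the question, rebuilt so the example runs on its own
df <- data.frame(
  id    = c(1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3),
  value = c(NA, -27.17, NA, -18.69, 17.27, -34.38,
            NA, 134.50, -46.96, 88.18, -32.27, -0.40, 36.69)
)

# TRUE where the threshold is met; missing values are treated as FALSE here
hit <- !is.na(df$value) & df$value <= -20

# Running count of hits within each id (ave() applies cumsum per group)
nth <- ave(as.integer(hit), df$id, FUN = cumsum)

# "yes" only on the first hit per id; keep NA where value is missing
df$baseline <- ifelse(is.na(df$value), NA_character_,
                      ifelse(hit & nth == 1, "yes", ""))

print(df)
```

So a plain `ifelse` can do the labelling, provided the grouping is handled separately, here by `ave`.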
G. Grothendieck