I am trying to recode a character variable with dplyr::recode() and stringr::str_detect(). I realize that this can be done with dplyr::case_when(), as documented here: https://community.rstudio.com/t/recoding-using-str-detect/5141, but I am convinced that there has to be a way of doing it via recode().

Consider this case:

library(tidyverse)
rm(list = ls())

data <- tribble(
  ~id, ~time,
  #--|--|
  1, "a",
  2, "b",
  3, "x"
)

I would like to replace the "x" in the dataframe with a "c" via str_detect() and here's how I'd do it:

data %>% 
 mutate(time = recode(data$time, str_detect(data$time, "x") = "c"))

But that doesn't work:

Error: unexpected '=' in: "data %>% mutate(time = recode(data$time, str_detect(data$time, "x") ="

Apparently R doesn't know what to do with the last =, but I believe it has to be there for the recode function, as demonstrated here:

recode(data$time, "x" = "c")

This executes properly, as does this:

str_detect(data$time, "x")

But this does not:

recode(data$time, str_detect(data$time, "x") = "c")

Is there a way of getting these two functions to work with each other?

tc_data
  • `str_detect` returns `TRUE` or `FALSE`, not the character you are looking for. Either use `gsub` or, if you want to use `str_detect`, combine it with `case_when` or `ifelse`. – phiver Apr 04 '18 at 14:33
  • So that is the problem. `recode()` does not understand what to do with `TRUE` instead of the actual character, I see. – tc_data Apr 04 '18 at 14:37
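
A note on the error itself: in `recode()`, the left-hand side of each `=` is an argument name, so it has to be a literal value such as `"x"` and cannot be an expression like `str_detect(...)`. That is why R stops at the parser, before `recode()` is even called. One possible workaround, which nobody in this thread proposed, is to build the old-to-new mapping as a named vector first and splice it in with `!!!`. The names `hits` and `mapping` below are made up for illustration.

```r
library(dplyr)
library(stringr)

times <- c("a", "b", "x")

# Values that match the pattern; these become the names recode() looks up
hits <- unique(times[str_detect(times, "x")])

# Named vector: names are the old values, elements are the replacement
mapping <- setNames(rep("c", length(hits)), hits)

# Splice the mapping into recode() as if it were written "x" = "c"
recoded <- recode(times, !!!mapping)
recoded
```

This sketch assumes at least one value matches; with an empty `mapping`, `recode()` has nothing to replace and raises an error, so a guard would be needed in real data.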

1 Answer

If you want the simplest possible approach, I'd use gsub:

library(dplyr)
data %>% 
  mutate(time = gsub("x", "c", time))

That eliminates the need for both recode and str_detect.

If you're dead set on using stringr, then you should use str_replace rather than str_detect:

data %>% 
  mutate(time = str_replace(time, "x", "c"))

If you want to replace the entire value if it contains an 'x', then just add some regex:

data %>% 
  mutate(time = str_replace(time, ".*x.*", "c"))

Breakdown of the regex: . matches any single character (except \n) and * repeats it zero or more times, so .* matches any run of characters, including an empty one. Putting .* both before and after the x means that any characters leading up to or trailing the 'x' are part of the match and get replaced along with it.
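
To make the difference between the two patterns concrete, here is a small sketch on a made-up vector (the values `"x_late"` and `"pre_x"` are hypothetical and not part of the question's data):

```r
library(stringr)

# Hypothetical values with characters before or after the "x"
times <- c("a", "b", "x", "x_late", "pre_x")

# Replaces only the matched "x" and leaves the rest of the string alone
partial <- str_replace(times, "x", "c")
partial
# "a" "b" "c" "c_late" "pre_c"

# Matches the whole string whenever it contains an "x", so the whole value is replaced
whole <- str_replace(times, ".*x.*", "c")
whole
# "a" "b" "c" "c" "c"
```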

Dave Gruenewald
  • this is unfortunately not an option, as I need str_detect to target and replace certain observations in my actual data. – tc_data Apr 04 '18 at 14:35
  • see my edits if you want to use `stringr`. `str_detect` just does not feel appropriate for your case. But to be honest, you just described `gsub` with your comment – Dave Gruenewald Apr 04 '18 at 14:43
  • I have a hunch you might be doing the same coursework as this question asked yesterday: https://stackoverflow.com/q/49632442/6535514 – Dave Gruenewald Apr 04 '18 at 14:48
  • Like `gsub`, `str_replace` only replaces the string that I tell it to look for and leaves any trailing characters intact. This is not what I am looking for, hence the need for `str_detect`. It seems like `case_when` really is the only appropriate solution for this purpose. Edit: interesting coincidence, but I'm trying to recode a time variable into an ordered factor with this. I'll stick to `case_when` for that purpose. – tc_data Apr 04 '18 at 14:48
  • Oh, then you just need to modify your pattern to include more specific regex. As is, your original question just said you wanted to change 'x' to 'c' – Dave Gruenewald Apr 04 '18 at 14:51
  • and there we have it! This seems to work as intended, but ultimately `case_when` has proven to be the best solution to this, for the reasons @phiver pointed out. – tc_data Apr 04 '18 at 14:58
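
For readers landing here, the `case_when` route mentioned in the comments might look roughly like the sketch below. It reuses the example data from the question; the asker's real data and final code were never posted, so treat this as an illustration only.

```r
library(dplyr)
library(stringr)

data <- tribble(
  ~id, ~time,
  1, "a",
  2, "b",
  3, "x"
)

result <- data %>%
  mutate(time = case_when(
    str_detect(time, "x") ~ "c",  # rows whose value contains "x" become "c"
    TRUE ~ time                   # every other row keeps its original value
  ))
result
```

Because `str_detect()` returns one logical value per row, it fits naturally on the left-hand side of a `case_when` formula, which is exactly where `recode()` could not use it.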