Depending on the text in one column, I want to assign a character to a second column and an integer to a third. The case_when conditions (LHS) for both assignments are identical; only the outcomes (RHS) differ. I am using exprs and !!! because I want to maintain the list of expressions in a single table.

My code is:

library(rlang)
library(tidyverse)

df <- data.frame(a=c("text-1" , "text_2", "text3"))

e1 <- 
  exprs(
    grepl("text-", a) ~ "a",
    grepl("text_", a) ~ "b",
    grepl("text[0-9]", a) ~ "c"
  )

e2 <- 
  exprs(
    grepl("text-", a) ~ 0,
    grepl("text_", a) ~ 1,
    grepl("text[0-9]", a) ~ 2
  )

test <- df %>% mutate(b=case_when(!!!e1),
                      c=case_when(!!!e2)
)

The expected outcome is:

> test
       a b c
1 text-1 a 0
2 text_2 b 1
3  text3 c 2

But it seems redundant and inefficient (with millions of records) to use two case_when expression lists with the same LHS, since every condition is evaluated twice. How can I reach the same result more efficiently?

Nils
  • The c column is, for instance, 0, 1, 2 (edited) or something else arbitrary. Thanks, zx8754. – Nils Aug 26 '19 at 10:10
  • left_join is an option when there are only == conditions, but regular expressions are involved here. I'll edit again. Thanks, Cole. – Nils Aug 26 '19 at 10:12
  • Do you want to use only one `case_when`? But then how would you evaluate different RHS values if your LHS is exactly the same? – Ronak Shah Aug 26 '19 at 10:18
  • I'd write a function that returns 2 column dataframe, then cbind? – zx8754 Aug 26 '19 at 10:19
  • Hi Ronak, yes that is actually my question.. – Nils Aug 26 '19 at 10:25
  • In this specific case you can map `c(0, 1, 2)` to `c(a, b, c)` with `letters[c(0, 1, 2) + 1]`, which could be used to replace one of the `case_when` expressions. – Joris C. Aug 26 '19 at 12:46

1 Answer

The main motive behind this is not clear to me, but following @zx8754's suggestion we can encode both outcomes in a single string and split it into two columns afterwards:

library(dplyr)
library(rlang)

e1 <- exprs(
  grepl("text-", a) ~ "a, 0",
  grepl("text_", a) ~ "b, 1",
  grepl("text[0-9]", a) ~ "c, 2"
)

df %>% 
  mutate(b = case_when(!!!e1)) %>%
  # split on ", " so that c has no leading space; convert = TRUE makes it integer
  tidyr::separate(b, into = c("b", "c"), sep = ", ", convert = TRUE)

#       a b c
#1 text-1 a 0
#2 text_2 b 1
#3  text3 c 2
Ronak Shah
  • Thanks Ronak, the motive is to avoid evaluating each condition twice, which seems inefficient since there are quite a few records to evaluate. I had also thought of a `separate` kind of solution. Is there no single-command solution available? – Nils Aug 26 '19 at 10:33
  • Adding `convert=T` to the arguments of `separate` turns column c into integer – Nils Aug 28 '19 at 11:00
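
As a possible alternative to splitting a string, the conditions can be evaluated once to produce a row index, and both outcome columns can then be looked up by that index. This is only a sketch and not part of the answer above: the names `rule_index`, `outcomes` and `idx` are made up for illustration, and it assumes the rules keep the same order in both objects.

```r
library(dplyr)
library(rlang)

df <- data.frame(a = c("text-1", "text_2", "text3"))

# Each condition (LHS) appears once; the RHS is only a row number
# into the hypothetical `outcomes` table below.
rule_index <- exprs(
  grepl("text-", a)     ~ 1L,
  grepl("text_", a)     ~ 2L,
  grepl("text[0-9]", a) ~ 3L
)

# Hypothetical lookup table: row i holds the outcomes for rule i.
outcomes <- data.frame(
  b = c("a", "b", "c"),
  c = c(0, 1, 2),
  stringsAsFactors = FALSE
)

test <- df %>%
  mutate(idx = case_when(!!!rule_index),  # conditions evaluated once
         b = outcomes$b[idx],             # plain vector indexing
         c = outcomes$c[idx]) %>%
  select(-idx)
```

Rows that match no condition get `NA` from `case_when`, and indexing with `NA` yields `NA` in both b and c, so unmatched rows stay visible rather than being silently assigned a value.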