Calculate new variable based on condition RStudio

Question

I want to create a new variable. Depending on the value of variable X, variable Y should be divided by one of three values.

I tried the following code, but it doesn't work, and I don't know how else to do it...

df %>% mutate(new_var = case_when(
    X == "yes"   ~ df$Y / 2,
    X == "no"    ~ df$Y / 10,
    X == "maybe" ~ df$Y / 12
))

Glad for any help!

The code for my data set:

fixationtime <- fixationtime %>%
  mutate(FIX_weighted = case_when(
    outl_item == "outlier"   ~ durFIX / 2,
    outl_item == "no outlier"    ~ durFIX / 10,
    outl_item == "without outliers" ~ durFIX / 12
  ))

Subset of data for reproducing:

structure(list(outl_item = c("no outlier", "no outlier", "no outlier", 
"outlier", "outlier", "outlier", "no outlier", "without outlier", 
"without outlier", "no outlier", "no outlier", "outlier", "without outlier", 
"without outlier", "outlier", "outlier", "outlier", "without outlier", 
"without outlier", "no outlier", "outlier", "without outlier", 
"without outlier", "without outlier", "no outlier", "outlier", 
"without outlier", "no outlier", "without outlier", "without outlier"
), VP = structure(c(17L, 45L, 41L, 46L, 39L, 32L, 26L, 27L, 2L, 
39L, 32L, 36L, 17L, 29L, 13L, 26L, 45L, 10L, 11L, 38L, 9L, 32L, 
45L, 15L, 19L, 12L, 43L, 39L, 22L, 6L), levels = c("1", "2", 
"3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", 
"15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", 
"26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", 
"37", "38", "39", "40", "41", "42", "43", "44", "45", "46"), class = "factor"), 
    TRIAL = c(3L, 34L, 36L, 27L, 26L, 6L, 11L, 13L, 30L, 37L, 
    38L, 36L, 40L, 14L, 23L, 37L, 22L, 14L, 16L, 11L, 11L, 40L, 
    23L, 36L, 38L, 6L, 23L, 35L, 12L, 33L), BLOCK = c(7L, 1L, 
    3L, 8L, 7L, 6L, 4L, 7L, 2L, 5L, 2L, 2L, 3L, 1L, 7L, 4L, 3L, 
    8L, 3L, 6L, 3L, 2L, 3L, 1L, 1L, 8L, 1L, 1L, 6L, 6L), OUTL = structure(c(2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 
    2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L), levels = c("no outlier", 
    "outlier"), class = "factor"), Condition = c("Parallel trials with outliers", 
    "Parallel trials with outliers", "Parallel trials with outliers", 
    "Parallel trials with outliers", "Parallel trials with outliers", 
    "Parallel trials with outliers", "Parallel trials with outliers", 
    "Parallel trials without outliers", "Parallel trials without outliers", 
    "Parallel trials with outliers", "Parallel trials with outliers", 
    "Parallel trials with outliers", "Parallel trials without outliers", 
    "Parallel trials without outliers", "Parallel trials with outliers", 
    "Parallel trials with outliers", "Parallel trials with outliers", 
    "Parallel trials without outliers", "Parallel trials without outliers", 
    "Parallel trials with outliers", "Parallel trials with outliers", 
    "Parallel trials without outliers", "Parallel trials without outliers", 
    "Parallel trials without outliers", "Parallel trials with outliers", 
    "Parallel trials with outliers", "Parallel trials without outliers", 
    "Parallel trials with outliers", "Parallel trials without outliers", 
    "Parallel trials without outliers"), durFIX = c(1515L, 1657L, 
    1315L, 0L, 175L, 762L, 946L, 1780L, 1417L, 1719L, 1686L, 
    576L, 1711L, 1559L, 0L, 0L, 586L, 1792L, 1708L, 1532L, 624L, 
    1685L, 1717L, 1386L, 1426L, 227L, 1688L, 1581L, 1042L, 1345L
    )), row.names = c(NA, -30L), class = "data.frame")

Please edit the question to make it reproducible, by including data, such as the output from `dput(df)`. If you're receiving an error message, sharing that will also help others answer. At the very least, we need to know what you mean when you say "it doesn't work". In the code you've shared, you should not include `df$` in your `case_when` function. — Seth, Jun 08 '23 at 13:09
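
To illustrate the point about `df$`: inside `mutate()`, a bare column name refers to the data currently flowing through the pipe, while `df$Y` always pulls the full column from the original object. The two happen to agree when nothing has changed the rows, which is why the code above can appear to run. A minimal sketch with a made-up data frame (not the asker's data) where they diverge after a `filter()`:

```r
library(dplyr)

# Hypothetical toy data, for illustration only
df <- data.frame(
  X = c("yes", "no", "maybe", "no", "yes"),
  Y = c(10, 20, 30, 40, 50)
)

# Bare column name: evaluated against the 4 rows left after filter()
ok <- df %>%
  filter(Y > 10) %>%
  mutate(half = Y / 2)

# df$Y: still the original 5-element column, so the sizes no longer
# match and mutate() raises an error, captured here with tryCatch()
bad <- tryCatch(
  df %>%
    filter(Y > 10) %>%
    mutate(half = df$Y / 2),
  error = function(e) e
)

ok$half                 # 10 15 20 25
inherits(bad, "error")  # TRUE
```

The same mismatch shows up with `group_by()`, since `df$Y` ignores the grouping as well.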

score 0 · Answer 1 · answered Jun 08 '23 at 21:53

As Seth said, if you get an error, post it in your question alongside some sample data. This usually makes it a lot easier to figure out what's wrong. ;-)

From what I can see, your code should work if you apply it to a dataframe:

library(dplyr)

# Some sample data
df <- data.frame(
    X = c("yes", "no", "maybe", "no", "yes"),
    Y = c(10, 20, 30, 40, 50)
)

# Apply the mutation
df <- df %>%
    mutate(new_var = case_when(
        X == "yes"   ~ Y / 2,
        X == "no"    ~ Y / 10,
        X == "maybe" ~ Y / 12,
        TRUE         ~ NA_real_ # Fallback: any other value of X yields NA
    ))

# View the data
print(df)

Output:

      X  Y new_var
1   yes 10     5.0
2    no 20     2.0
3 maybe 30     2.5
4    no 40     4.0
5   yes 50    25.0

answered Jun 08 '23 at 21:53

BrJ


Thank you! Yes, I also think that the code should work. However, for the third value of X, the new variable is NA everywhere. I'll provide a subset of my data. – Nori Jun 09 '23 at 06:58
Hi @Nori - In your data set the entry for `outl_item` is "without outlier", but in your `case_when` function you test for "without outliers". Removing the "s" in your conditional function returns the expected result. – Seth Jun 09 '23 at 15:34
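
A quick way to catch this kind of mismatch is to compare the values actually present in the column with the labels used in `case_when()`. A minimal sketch, assuming a small stand-in data frame (`toy` and `tested` are made-up names; the three rows are taken from the subset posted in the question):

```r
library(dplyr)

# Stand-in for the asker's data; values copied from the posted subset
toy <- data.frame(
  outl_item = c("no outlier", "outlier", "without outlier"),
  durFIX    = c(1515, 175, 1780)
)

# Labels as originally written in case_when(), including the stray "s"
tested <- c("outlier", "no outlier", "without outliers")

# Values present in the data that no condition covers
setdiff(unique(toy$outl_item), tested)
# "without outlier"

# Corrected labels, with a fallback so unmatched rows stay visible as NA
toy <- toy %>%
  mutate(FIX_weighted = case_when(
    outl_item == "outlier"         ~ durFIX / 2,
    outl_item == "no outlier"      ~ durFIX / 10,
    outl_item == "without outlier" ~ durFIX / 12,
    TRUE                           ~ NA_real_
  ))

# No NA left once the labels match
anyNA(toy$FIX_weighted)
# FALSE
```

Running the `setdiff()` check before the `mutate()` makes a typo in a label visible immediately, instead of surfacing later as a column of NA.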
