I have a data frame of 22 columns and 240 rows.

I want to create a new column with Likert-scale numeric values that depend on the values of the other columns.

The new column, named Result, should be 2 for secretary rows where at least one of the count columns is >= 1, and 0 otherwise.

For driver rows, Result should be -2 if one of the count columns == 0, and 0 otherwise.

The second data frame below is an example of the result I want. Here are my data and my code, which doesn't work:

   profession  ve.count.descrition  euse.count.description  ne.count.title
   secretary   0                    1                       2
   secretary   0                    2                       1
   driver      1                    1                       0
   driver      0                    0                       0

data <- data %>%
 for (Result in Profession){
   if(secretary [ve.count.description:ne.count.title] >= 1){
        Result == 2
   }else{Result == 0
   }
   if(driver[ve.count.description:ne.count.title] == 0){
     Result == -2
   }else{result == 0
   }
 }
mutate(result,data)

Output:

Error in for (. in result) Profession : 
 4 arguments passed to 'for' which requires 3

Desired result:

profession  ve.count.descrition  euse.count.description  ne.count.title  Result
secretary   0                    1                       2               2
secretary   0                    2                       1               2
driver      1                    1                       0               0
driver      0                    0                       0               -2
Fella
  • I think you can do it with dplyr's mutate() and case_when() functions. This should work: https://www.datasciencemadesimple.com/case-statement-r-using-case_when-dplyr/ – pbraeutigm May 16 '22 at 12:30

2 Answers

You can use case_when(), like this:

    library(dplyr)

    data %>%
      mutate(Result = case_when(
        # secretary rows: at least one count column is >= 1
        profession == "secretary" & (ve.count.descrition >= 1 |
                                     euse.count.description >= 1 |
                                     ne.count.title >= 1) ~ 2,
        # driver rows: at least one count column is 0
        profession == "driver" & (ve.count.descrition == 0 |
                                  euse.count.description == 0 |
                                  ne.count.title == 0) ~ -2,
        # every other row
        TRUE ~ 0
      ))
Lucca Nielsen
  • Note to OP: regarding LN's suggested case_when() approach, R has many functions optimized to search through vectors, arrays, tables, and lists of all sorts. R uses the structure itself to supply the looping framework and implements it in code that is usually much more optimized than user-defined for loops. This is a basic mental model you need to get comfortable with to work well in any R flavor or approach, including dplyr as here. – John Garland May 16 '22 at 12:45
  • Tidyverse logic works great for me. It's more intuitive. Maybe because my background is Public Health and not anything related to programming and data specifically. – Lucca Nielsen May 16 '22 at 12:48
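
To illustrate the comment above about letting R's vectorized functions do the looping, here is a minimal base-R sketch. It assumes the column names exactly as posted in the question (including the `descrition` spelling) and applies the same "at least one column" rules as the answer. The names `count_cols`, `any_ge_1` and `any_eq_0` are illustrative only and do not come from the thread.

```r
# Example data copied from the question (column names kept as posted).
data <- data.frame(
  profession = c("secretary", "secretary", "driver", "driver"),
  ve.count.descrition = c(0, 0, 1, 0),
  euse.count.description = c(1, 2, 1, 0),
  ne.count.title = c(2, 1, 0, 0)
)

# Hypothetical helper: the columns that hold the counts.
count_cols <- c("ve.count.descrition", "euse.count.description", "ne.count.title")

# Comparing the whole block of columns gives a logical matrix;
# rowSums() then counts, per row, how many columns pass the test.
any_ge_1 <- rowSums(data[count_cols] >= 1) > 0
any_eq_0 <- rowSums(data[count_cols] == 0) > 0

# Nested ifelse() works on whole vectors, so no explicit for loop is needed.
data$Result <- ifelse(
  data$profession == "secretary" & any_ge_1, 2,
  ifelse(data$profession == "driver" & any_eq_0, -2, 0)
)

print(data)
```

With 22 columns, only `count_cols` would need to change; the comparisons and `rowSums()` calls stay the same.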

Another option:

library(tidyverse)

tribble(
  ~profession, ~ve.count.descrition, ~euse.count.description, ~ne.count.title,
  "secretary", 0, 1, 2,
  "secretary", 0, 2, 1,
  "driver", 1, 1, 0,
  "driver", 0, 0, 0
) |>
  rowwise() |> 
  mutate(result = case_when(
    profession == "secretary" & any(c_across(2:4) >= 1) ~ 2,
    profession == "driver" & any(c_across(2:4) == 0) ~ -2,
    TRUE ~ 0
  ))
#> # A tibble: 4 × 5
#> # Rowwise: 
#>   profession ve.count.descrition euse.count.description ne.count.title result
#>   <chr>                    <dbl>                  <dbl>          <dbl>  <dbl>
#> 1 secretary                    0                      1              2      2
#> 2 secretary                    0                      2              1      2
#> 3 driver                       1                      1              0     -2
#> 4 driver                       0                      0              0     -2

Created on 2022-05-16 by the reprex package (v2.0.1)
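
Note that the rule as written in the question ("one of the columns == 0") gives -2 for the third row, as in the output above, while the expected table in the question shows 0 there. If the intent is that a driver row gets -2 only when all count columns are 0 (this is an assumption, the question does not say so), a sketch using dplyr's if_any() and if_all() (available in dplyr 1.0.4 and later) could look like this. `count_cols` is an illustrative name.

```r
library(dplyr)

# Example data copied from the question (column names kept as posted).
data <- data.frame(
  profession = c("secretary", "secretary", "driver", "driver"),
  ve.count.descrition = c(0, 0, 1, 0),
  euse.count.description = c(1, 2, 1, 0),
  ne.count.title = c(2, 1, 0, 0)
)

# Hypothetical helper: the columns that hold the counts.
count_cols <- c("ve.count.descrition", "euse.count.description", "ne.count.title")

data <- data %>%
  mutate(Result = case_when(
    # secretary: at least one count column is >= 1
    profession == "secretary" & if_any(all_of(count_cols), ~ .x >= 1) ~ 2,
    # driver: ASSUMPTION, every count column must be 0
    profession == "driver" & if_all(all_of(count_cols), ~ .x == 0) ~ -2,
    # everything else
    TRUE ~ 0
  ))

print(data)
```

Swapping if_all() back to if_any() in the driver condition restores the "at least one column is 0" reading used in the answers.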

Carl