I would like to create a new variable based on the answers to three other variables (A, B and C) in my data set. Each of these variables has three modalities: "often", "sometimes" and "never". In the new variable, each individual should get a grade ranging from 0 to 3: for each variable (A, B and C), an answer of "often" earns 1 point, and any other answer earns 0.

My data set looks like this, with "often" coded as 2, "sometimes" as 1 and "never" as 0.

A <- c(2,1,1,NA, 0,2)
B <- c(2,2,0,1,2,NA)
C <- c(2,1,NA,2,1,0)

data <- data.frame(A,B,C)

I know I could use case_when, but it is a rather unwieldy solution. I was thinking of a loop, but I have never used loops in R. Can you help me with this loop?

1 Answer

Do you mean something like this?

Update: thanks to markus, whose solution (rowSums(data == 2, na.rm = TRUE)) is much better than my original.

base R

data$points = rowSums(data == 2, na.rm = TRUE)

dplyr

library(dplyr)

data %>% mutate(points = rowSums(data == 2, na.rm = TRUE))

data.table

library(data.table)

setDT(data)

data[, points:=rowSums(data == 2, na.rm = TRUE)]

Output

> data
   A  B  C points
1  2  2  2      3
2  1  2  1      1
3  1  0 NA      0
4 NA  1  2      1
5  0  2  1      1
6  2 NA  0      1
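
To make it clearer why this one-liner works, here is a minimal sketch that splits it into its two steps and checks it against the explicit loop the question asked about. The names `is_often`, `points_vectorised` and `points_loop` are only illustrative and do not appear in the original code.

```r
# Same example data as in the question (2 = "often", 1 = "sometimes", 0 = "never")
A <- c(2, 1, 1, NA, 0, 2)
B <- c(2, 2, 0, 1, 2, NA)
C <- c(2, 1, NA, 2, 1, 0)
data <- data.frame(A, B, C)

# Step 1: comparing the whole data frame with 2 gives a logical matrix;
# missing answers stay NA
is_often <- data == 2

# Step 2: rowSums() counts TRUE as 1 and FALSE as 0; na.rm = TRUE skips the NAs
points_vectorised <- rowSums(is_often, na.rm = TRUE)

# Equivalent explicit loop (illustrative only; the vectorised form is preferable)
points_loop <- numeric(nrow(data))
for (i in seq_len(nrow(data))) {
  for (col in c("A", "B", "C")) {
    answer <- data[i, col]
    # Only non-missing answers equal to 2 ("often") earn a point
    if (!is.na(answer) && answer == 2) {
      points_loop[i] <- points_loop[i] + 1
    }
  }
}

# Both approaches give the same grades
stopifnot(all(points_vectorised == points_loop))
```

The loop does the same work cell by cell, which is why the vectorised `rowSums()` version is usually the more readable choice in R.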
Marco_CH
  • Yes, exactly! And if I have other columns (let's say D and E) that I don't want to include in the row sum, can I specify that I want the row sum only over A, B and C? – Victor LE FRANC Jan 26 '22 at 15:51
  • Yes, then you could specify the columns by name, like `data$points = rowSums(data[c("A", "B", "C")] == 2, na.rm = TRUE)`, or by column index, like `data$points = rowSums(data[c(1,2,3)] == 2, na.rm = TRUE)` – Marco_CH Jan 26 '22 at 15:53
  • Thank you for your help, I was able to create my variable! – Victor LE FRANC Jan 26 '22 at 17:09
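
The column restriction discussed in the comments can be sketched as follows. The values of D and E are made up for illustration (the thread only mentions their names), and `score_cols` is a hypothetical helper variable.

```r
# Example data from the question
A <- c(2, 1, 1, NA, 0, 2)
B <- c(2, 2, 0, 1, 2, NA)
C <- c(2, 1, NA, 2, 1, 0)

# Extra columns that should NOT count towards the grade (made-up values)
D <- c(2, 2, 2, 2, 2, 2)
E <- c(0, 1, 2, NA, 2, 2)

data <- data.frame(A, B, C, D, E)

# Hypothetical helper: names of the columns that enter the grade
score_cols <- c("A", "B", "C")

# Only the selected columns are compared with 2, so D and E are ignored
data$points <- rowSums(data[score_cols] == 2, na.rm = TRUE)

print(data)
```

Selecting by name rather than by position keeps the result stable if columns are later added or reordered.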