I have a dataset of binary responses (0, 1) to a number of questions like below. Each row indicates an individual and each column is a response to a question. "completed" indicates how many questions were reached. For example, if completed = 2 only q_1 and q_2 were responded too by that individual. I want to rescore each "q" column such than any column number greater than "completed" column is a 0, otherwise use the value from the corresponding "q" column.
have = data.frame(q_1 = c(1,0,1,1,0),
q_2 = c(1,1,1,1,0),
q_3 = c(0,0,1,1,0),
q_4 = c(1,0,0,1,1),
q_5 = c(1,0,0,0,1),
completed = c(2, 3, 2, 4, 1))
> have
q_1 q_2 q_3 q_4 q_5 completed
1 1 1 0 1 1 2
2 0 1 0 0 0 3
3 1 1 1 0 0 2
4 1 1 1 1 0 4
5 0 0 0 1 1 1
How can I get to this output? Would it be easier to transform the dataset?
> want
q_1 q_2 q_3 q_4 q_5 completed scored_1 scored_2 scored_3 scored_4 scored_5
1 1 1 0 1 1 2 1 1 0 0 0
2 0 1 0 0 0 3 0 1 0 0 0
3 1 1 1 0 0 2 1 1 0 0 0
4 1 1 1 1 0 4 1 1 1 1 0
5 0 0 0 1 1 1 0 0 0 0 0
This code will get the correct output. However, my real dataset is very large so I would need to be able to loop through the columns.
want = have %>%
mutate(scored_1 = ifelse(completed >= 1, q_1, 0),
scored_2 = ifelse(completed >= 2, q_2, 0),
scored_3 = ifelse(completed >= 3, q_3, 0),
scored_4 = ifelse(completed >= 4, q_4, 0),
scored_5 = ifelse(completed >= 5, q_5, 0))