I have two data sets: The first data set contains participants' numerical answers to questions:

data <- data.frame(Q1 = 1:5,
                   Q2 = rev(1:5),
                   Q3 = c(4, 5, 1, 2, 3))

The second data set serves as a reference table where the solutions are stored:

ref.table <- data.frame(Question = c("Q1", "Q2", "Q3"),
                        Solution = c("big", "big", "small"))

I would like to compare the two data sets and create a new data set that contains the binary information on whether the answer was correct (1) or incorrect (0). For this, answers 1, 2, 3 correspond to "small", and answers 4, 5 correspond to "big".

My attempt is the following:

accuracy <- data.frame(lapply(data, function(x) {ifelse(x >= 4 & ref.table$Solution[ref.table$Question == colnames(data)[x]] == "big", 1, 0)}))

But somehow, this only gives me the incorrect answers as 0, while the correct answers are NA.

Does anyone know how to solve this? Thank you!

aynber

1 Answer

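With the tidyverse, loop across the columns with across(). For each column, match the column name (cur_column()) against the 'Question' column of 'ref.table' to get the corresponding 'Solution' value, check whether that value is "big" and the answer is >= 4, and coerce the resulting logical to binary with the unary +. As written, this scores only the "big" questions; a "small" question such as Q3 always comes out as 0.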

library(dplyr)

data %>%
  # 1 if the answer is >= 4 and the solution for this column is "big", else 0
  mutate(across(everything(), ~ +(.x >= 4 &
    ref.table$Solution[match(cur_column(), ref.table$Question)] == "big")))

-output

  Q1 Q2 Q3
1  0  1  0
2  0  1  0
3  0  0  0
4  1  0  0
5  1  0  0

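Or in base R, loop over the column names with lapply and extract each column with [[; the match logic is the same as above. Note that the \(nm) lambda shorthand requires R 4.1 or later, and that assigning to data[] overwrites the original answers.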

data[] <- lapply(names(data), \(nm) +(data[[nm]] >= 4 &
    ref.table$Solution[match(nm, ref.table$Question)] == "big"))
akrun
Thank you for your help! I added an OR condition so that the answers corresponding to "small" are also accounted for: `data %>% mutate(across(everything(), ~ +((.x >= 4 & ref.table$Solution[match(cur_column(), ref.table$Question)] == "big") | (.x < 4 & ref.table$Solution[match(cur_column(), ref.table$Question)] == "small"))))` – user20291515 Nov 02 '22 at 09:04
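
The OR condition in the comment can also be avoided by translating each answer into its label first and then comparing that label with the solution. The following is only a minimal base R sketch of that idea, not the answerer's code. It assumes that every question in 'data' appears in 'ref.table' and that the only labels are "big" (answers 4, 5) and "small" (answers 1, 2, 3). The names 'solution' and 'label' are introduced here purely for illustration.

```r
data <- data.frame(Q1 = 1:5,
                   Q2 = rev(1:5),
                   Q3 = c(4, 5, 1, 2, 3))

ref.table <- data.frame(Question = c("Q1", "Q2", "Q3"),
                        Solution = c("big", "big", "small"))

# Expected label for each column of 'data', in column order
# (as.character guards against factor columns in R < 4.0)
solution <- as.character(ref.table$Solution)[match(names(data), ref.table$Question)]

# Keep the original answers and write the scores to a new data frame
accuracy <- data
accuracy[] <- Map(function(x, sol) {
  label <- ifelse(x >= 4, "big", "small")  # assumed coding: 4-5 = big, 1-3 = small
  as.integer(label == sol)                 # 1 = correct, 0 = incorrect
}, data, solution)

accuracy
```

With the example data, Q1 is correct only for answers 4 and 5, Q2 only for its first two rows, and Q3 only for its last three rows, because Q3's solution is "small".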