I have a long dataset with students grades and subjects. I want to keep a long dataset, but I want to add a column that tells me how many Fs a student had in their humanity courses (English & History) and their STEM courses (biology & math). I also want the same for Ds, Cs, Bs, and As.
I know I could explicitly spell this out, but in the future, they might have other subjects (like adding Chemistry to STEM) or completely different categories, like Foreign Languages, so I want it to be scalable.
I know how to get all combinations of columns, and I know how to do to each part manually--but I don't know how to combine the two. Any help would be greatly appreciated!
#Sample data
library(tidyverse)
student_grades <- tibble(student_id = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5),
subject = c(rep(c("english", "biology", "math", "history"), 4), NA, "biology"),
grade = c(1, 2, 3, 4, 5, 4, 3, 2, 2, 4, 1, 1, 1, 1, 2, 3, 3, 4))
#All combinations of grades and subjects
all_subject_combos <- c("eng|his", "bio|math")
all_grades <- c("F", "D", "C",
"B", "A")
subjects_and_letter_grades <- expand.grid(all_subject_combos, all_grades)
all_combos <- subjects_and_letter_grades %>%
unite("names", c(Var1, Var2)) %>%
mutate(names = str_replace_all(names, "\\|", "_")) %>%
pull(names)
#Manual generation of numbers of Fs by subject
#This is what I want the results to look like, but with all other letter grades
student_grades %>%
group_by(student_id) %>%
mutate(eng_his_F = sum((case_when(
str_detect(subject, "eng|his") & grade == 1 ~ 1,
TRUE ~ 0)), na.rm = TRUE),
bio_math_F = sum((case_when(
str_detect(subject, "bio|math") & grade == 1 ~ 1,
TRUE ~ 0)), na.rm = TRUE)) %>%
ungroup()
Ideally, this would be scalable for any number of subject combos and wouldn't require me to write out the same code for Ds, Cs, Bs, and As. Thank you!