I have downloaded a large data frame that I can't share for copyright reasons. I want to add two columns that classify the newspapers: one distinguishing online from press, the other distinguishing free from paid, and so on.
I used the following code, which worked for many rows but skipped other rows that contain the same value (see the screenshot of the data frame).
library(dplyr)
pressAbo <- c("NZZ", "BU", "TA", "NLZ", "BZ", "BAZ", "AZM", "SGT")
pressBLV <- c("BLI")
pressSM <- c("SBLI", "TAS", "WOZ", "NZZS", "SAS")
pressPen <- c("ZWA")
onlineAbo <- c("NZZO", "NNTA", "NNBE", "NNBS", "SGTO", "LUZO")
onlineBLV <- c("BLIO")
onlinePen <- c("ZAWO")
query <- query %>%
  mutate(
    Gattung = case_when(
      (medium_code == pressAbo) ~ "Presse",
      (medium_code == pressBLV) ~ "Presse",
      (medium_code == pressSM) ~ "Presse",
      (medium_code == pressPen) ~ "Presse",
      (medium_code == onlineAbo) ~ "Online",
      (medium_code == onlineBLV) ~ "Online",
      (medium_code == onlinePen) ~ "Online",
    ),
    .after = "medium_name"
  )
(The grouping is this detailed because another column will be added that differentiates further between them.)
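Since I can't share the data, here is a small made-up sketch of the kind of comparison used inside case_when() above. It uses base R only, and `medium_code` is a plain vector of invented values here, not the real column. I am not certain this is what happens in my data frame, but as far as I understand, `==` compares element by element and recycles the shorter vector, whereas `%in%` tests membership:

```r
# Made-up stand-in for the real column: 16 rows that all hold "NZZ".
pressAbo <- c("NZZ", "BU", "TA", "NLZ", "BZ", "BAZ", "AZM", "SGT")
medium_code <- rep("NZZ", 16)

# `==` lines up row 1 with pressAbo[1], row 2 with pressAbo[2], ...
# and starts again at pressAbo[1] on row 9 (recycling).
eq_match <- medium_code == pressAbo

# `%in%` asks, for every row, whether the value occurs anywhere in pressAbo.
in_match <- medium_code %in% pressAbo

which(eq_match)  # 1 9
all(in_match)    # TRUE
```

In this toy case only rows 1 and 9 match with `==`, even though every row holds the same value, which looks a lot like "assigned in some rows but not in others".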
This is what the mutate() call above resulted in: some rows with the same value in "medium_code" were assigned the correct word, others an NA:
"medium_code" "NZZ" has been assigned "Gattung" "Presse" but not in all cases
I know my not being able to share the data makes it impossible to reproduce my problem, but maybe there is something obvious I don't see. I'm guessing the different cases are actually somehow different but I can't see that by eye. I have tried adding a space after the "medium_code", but it didn't help.
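To test the guess that the values differ invisibly, this is the kind of check I have in mind. The values below are invented for illustration (`codes` is a hypothetical stand-in for `query$medium_code`), and the sketch only covers leading or trailing whitespace, not other hidden characters:

```r
# Invented values: two of the four carry stray whitespace.
codes <- c("NZZ", "NZZ ", " NZZ", "NZZ")

# Values that look alike on screen show up as separate entries here.
distinct_raw <- unique(codes)

# Character counts expose the extra spaces.
lengths_raw <- nchar(codes)

# trimws() strips leading and trailing whitespace.
cleaned <- trimws(codes)
distinct_clean <- unique(cleaned)
```

If `unique()` on the real column returned only the expected codes and `nchar()` showed no surprises, hidden whitespace would be ruled out as the cause.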