I have downloaded a very large data frame that I can't share for copyright reasons. I want to add two columns that classify the different types of newspapers: one as online or press, the other as free or paid, and so on.

I used the following code. It worked for many rows but skipped others that contain the same value (see the result described below the code):

library(dplyr)

pressAbo <- c("NZZ", "BU", "TA", "NLZ", "BZ", "BAZ", "AZM", "SGT")
pressBLV <- c("BLI")
pressSM <- c("SBLI", "TAS", "WOZ", "NZZS", "SAS")
pressPen <- c("ZWA")
onlineAbo <- c("NZZO", "NNTA", "NNBE", "NNBS", "SGTO", "LUZO")
onlineBLV <- c("BLIO")
onlinePen <- c("ZAWO")

query <- query %>%
  mutate(Gattung = case_when(
    (medium_code == pressAbo) ~ "Presse",
    (medium_code == pressBLV) ~ "Presse",
    (medium_code == pressSM) ~ "Presse",
    (medium_code == pressPen) ~ "Presse",
    (medium_code == onlineAbo) ~ "Online",
    (medium_code == onlineBLV) ~ "Online",
    (medium_code == onlinePen) ~ "Online",
    ),
    .after="medium_name")

(The grouping is this detailed because another column will be added that differentiates further between the groups.)

This is what the code produced: some rows with the same value in "medium_code" were assigned the correct label, while others got NA:

"medium_code" "NZZ" has been assigned "Gattung" "Presse" but not in all cases

I know that not being able to share the data makes my problem impossible to reproduce, but maybe there is something obvious I'm missing. My guess is that the affected values actually differ in some way that I can't see by eye. I have tried adding a space after the "medium_code" values, but it didn't help.
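
One possible explanation, offered as a guess since the data cannot be inspected: `==` compares two vectors position by position and recycles the shorter one, so `medium_code == pressAbo` does not test whether each code is a member of `pressAbo`. A row then only matches when its code happens to line up with the same code in the recycled vector, which would produce exactly this mix of correct labels and NA. The sketch below uses a small made-up `query` data frame (the rows and the `medium_name` values are hypothetical) to show the difference and a membership test with `%in%` instead.

```r
library(dplyr)

# Code vectors as defined in the question
pressAbo  <- c("NZZ", "BU", "TA", "NLZ", "BZ", "BAZ", "AZM", "SGT")
pressBLV  <- c("BLI")
pressSM   <- c("SBLI", "TAS", "WOZ", "NZZS", "SAS")
pressPen  <- c("ZWA")
onlineAbo <- c("NZZO", "NNTA", "NNBE", "NNBS", "SGTO", "LUZO")
onlineBLV <- c("BLIO")
onlinePen <- c("ZAWO")

# Hypothetical toy data standing in for the real `query` data frame
query <- tibble(
  medium_name = paste0("name", 1:4),
  medium_code = c("NZZ", "NZZ", "BLIO", "XYZ")
)

# `==` recycles c("NZZ", "BU") to NZZ, BU, NZZ, BU and compares by position,
# so the second "NZZ" is compared with "BU" and is missed
eq_result <- query$medium_code == c("NZZ", "BU")

# `%in%` asks, for every element, whether it occurs anywhere in the set
in_result <- query$medium_code %in% c("NZZ", "BU")

# Combine the detailed groups for this column; they stay available
# separately for the second, more detailed column
press  <- c(pressAbo, pressBLV, pressSM, pressPen)
online <- c(onlineAbo, onlineBLV, onlinePen)

query <- query %>%
  mutate(
    Gattung = case_when(
      medium_code %in% press  ~ "Presse",
      medium_code %in% online ~ "Online"
      # codes in neither set (here "XYZ") stay NA
    ),
    .after = "medium_name"
  )
```

If some rows are still NA after switching to `%in%`, the codes themselves may differ invisibly (for example through surrounding whitespace), which could be checked with something like `unique(query$medium_code)`.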

Valpal