I want to dummy-code whether a name given in one string is contained in another, structured string. For example:
player <- c("Michael Jordan", "Steve Kerr", "Michael Jordan", "Toni Kukoc")
bulls <- c("Jordan, Michael Jeffrey", "Pippen, Scottie; Harper, Ron",
"Rodman, Dennis", "Kerr, Steve; Longley, Luc; Kukoc, Toni")
and create a new variable (say, included) that is TRUE if the words Michael and Jordan are both present in bulls[1], Steve and Kerr in bulls[2], etc. The above should produce TRUE FALSE FALSE TRUE. For generality, surnames and given names are separated by commas, whereas a semicolon separates multiple people in a single entry. Given that the object bulls can feature longer versions of a name ("Jeffrey" in this case) but not the other way around, I suspect the solution might require some sort of is.element check? I want to iterate this over a long list; what is the best approach?
P.S. I tried several stringr verbs (_view, _extract, etc.), but no luck so far.
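To make the intended check concrete, here is a minimal sketch in base R of the logic described above. The helper name name_in_entry is hypothetical, and the sketch assumes exact, case-sensitive word matches and that all words of a name must fall within the same semicolon-separated person (my reading of the rule, not something stated outright). It is an illustration of the matching rule, not necessarily the best approach for a long list.

```r
player <- c("Michael Jordan", "Steve Kerr", "Michael Jordan", "Toni Kukoc")
bulls <- c("Jordan, Michael Jeffrey", "Pippen, Scottie; Harper, Ron",
           "Rodman, Dennis", "Kerr, Steve; Longley, Luc; Kukoc, Toni")

# Hypothetical helper: TRUE if every word of `name` occurs within a single
# person of `entry` (people are split on ";", name parts on commas/spaces).
name_in_entry <- function(name, entry) {
  wanted <- strsplit(name, " +")[[1]]
  people <- strsplit(entry, ";", fixed = TRUE)[[1]]
  any(vapply(people, function(person) {
    parts <- strsplit(trimws(person), "[, ]+")[[1]]
    # Extra parts such as "Jeffrey" are allowed; missing ones are not.
    all(wanted %in% parts)
  }, logical(1)))
}

# Compare player[i] with bulls[i], element by element.
included <- unname(mapply(name_in_entry, player, bulls))
included
# TRUE FALSE FALSE TRUE
```

Because the comparison is `all(wanted %in% parts)`, bulls may contain a longer version of a name, but a player name with a word missing from bulls does not match, which is the asymmetry described in the question.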