List of USA movies, restricted to the columns of interest:
dfUSARating <- dfUSAMovies[, c("rowNum","title", "genre", "rating", "vote")]
Pulling the rows whose genre contains Thriller:
thrill <- dfUSARating %>% filter(str_detect(genre, "Thriller"))
head(dfUSARating$genre, n=3)
[1] ['Documentary', 'Comedy', 'Drama', 'Fantasy', 'Mystery', 'Sci-Fi']
[2] ['Comedy', 'Horror', 'Sci-Fi']
[3] ['Biography', 'Drama', 'Sport']
There are repeats of genres. I want to keep a row only if its genre string starts with Thriller, not if the string merely contains Thriller. Movies have multiple genres, so I am getting repeats.
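One way to express "starts with Thriller" is to anchor the pattern at the beginning of the string with ^ and escape the opening bracket. The sketch below is only an illustration under assumptions: `demo` is a hypothetical stand-in for dfUSARating with genre strings in the same format, and `thrill_first` is a made-up name.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for dfUSARating (not the real data)
demo <- data.frame(
  rowNum = 1:4,
  genre = factor(c(
    "['Thriller', 'Drama']",
    "['Comedy', 'Drama', 'Romance', 'Thriller']",
    "['Thriller']",
    "['Western']"
  ))
)

# ^ anchors the match at the start of the string and \\[ matches the literal
# opening bracket, so only rows whose first listed genre is Thriller are kept.
# as.character() is used because genre is stored as a factor.
thrill_first <- demo %>%
  filter(str_detect(as.character(genre), "^\\['Thriller'"))

print(thrill_first$rowNum)
```

With the unanchored pattern "Thriller", the second row would also match, which is the kind of repeat described above.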
dput(head(dfUSARating))
structure(list(rowNum = c(6L, 7L, 8L, 12L, 13L, 15L), genre = structure(c(869L,
752L, 638L, 130L, 229L, 910L), .Label = c(
"['Action', 'Adventure', 'Animation', 'Comedy']",
"['Action', 'Adventure', 'Biography', 'Drama', 'History', 'War']",
"['Action', 'Adventure', 'Biography', 'Drama', 'History']",
"['Action', 'Adventure', 'Biography', 'History', 'Romance']",
"['Action', 'Adventure', 'Biography', 'History']",
"['Action', 'Adventure', 'Comedy', 'Crime', 'Drama', 'Thriller']",
"['Comedy', 'Drama', 'Mystery']",
"['Comedy', 'Drama', 'Romance', 'Fantasy']",
"['Comedy', 'Drama', 'Romance', 'Sci-Fi']",
"['Comedy', 'Drama', 'Romance', 'Sport']",
"['Comedy', 'Drama', 'Romance', 'Thriller']",
"['Comedy', 'Drama', 'Romance', 'War']",
"['Comedy', 'Drama', 'Romance', 'Western']",
"['Comedy', 'Drama',
"['Western']"), class = "factor"), rating = c(5.3, 4.5, 7.8,
4.8, 7.1, 7.6)), row.names = c(NA, 6L), class = "data.frame")
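Since the dput output shows genre stored as a factor of list-like strings, another option is to pull out the first listed genre and compare it exactly. This is a rough base R sketch, assuming every string has the "['A', 'B']" shape seen above; the vector `genre` here is hypothetical sample data, and `first_genre` and `is_thriller_first` are illustrative names.

```r
# Hypothetical genre strings in the same format as the dput output
genre <- factor(c(
  "['Action', 'Adventure', 'Comedy', 'Crime', 'Drama', 'Thriller']",
  "['Thriller', 'Mystery']",
  "['Western']"
))

# Capture the text between the first pair of single quotes,
# i.e. the first genre in each list-like string
first_genre <- sub("^\\['([^']*)'.*$", "\\1", as.character(genre))

# TRUE only where the first listed genre is exactly Thriller
is_thriller_first <- first_genre == "Thriller"

print(first_genre)
print(is_thriller_first)
```

The logical vector could then be used to subset the data frame, for example dfUSARating[is_thriller_first, ] if it were computed from dfUSARating$genre.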