Filter rows where a column's strings start with a specific word in R?

Question

List of USA movies, restricted to the columns I need:

    dfUSARating <- dfUSAMovies[, c("rowNum","title", "genre", "rating", "vote")]

Pulling rows where genre contains Thriller:

    thrill <- dfUSARating %>% filter(str_detect(dfUSARating$genre, "Thriller"))
    head(dfUSARating$genre, n=3) 
    [1] ['Documentary', 'Comedy', 'Drama', 'Fantasy', 'Mystery', 'Sci-Fi']
    [2] ['Comedy', 'Horror', 'Sci-Fi']
    [3] ['Biography', 'Drama', 'Sport']

There are repeats of genres. I want to keep a row only if its genre string starts with Thriller, not if the string merely contains Thriller. Movies have multiple genres, and I am getting repeats.

    dput(head(dfUSARating))
    structure(list(rowNum = c(6L, 7L, 8L, 12L, 13L, 15L), genre = 
    structure(c(869L, 
    752L, 638L, 130L, 229L, 910L), .Label = c("['Action', 
    'Adventure', 'Animation', 'Comedy']", 
    "['Action', 'Adventure', 'Biography', 'Drama', 'History', 
    'War']", 
    "['Action', 'Adventure', 'Biography', 'Drama', 'History']", " 
     ['Action', 'Adventure', 'Biography', 'History', 'Romance']", 
    "['Action', 'Adventure', 'Biography', 'History']", "['Action', 
    'Adventure', 'Comedy', 'Crime', 'Drama', 'Thriller']", 
    "['Comedy', 'Drama', 'Mystery']", "['Comedy', 'Drama', 'Romance', 
    'Fantasy']", 
    "['Comedy', 'Drama', 'Romance', 'Sci-Fi']", "['Comedy', 'Drama', 
    'Romance', 'Sport']", 
    "['Comedy', 'Drama', 'Romance', 'Thriller']", "['Comedy', 
    'Drama', 'Romance', 'War']", 
    "['Comedy', 'Drama', 'Romance', 'Western']", "['Comedy', 'Drama',  
    "['Western']"), class = "factor"), rating = c(5.3, 4.5, 7.8, 
     4.8, 7.1, 7.6)), row.names = c(NA, 6L), class = "data.frame")

TarJae · Answer 1 · 2021-12-01T20:39:51.120

1

Use `^`: it is an anchor that matches the start of the string:

    thrill <- dfUSARating %>% filter(str_detect(genre, "^Thriller"))
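
As a minimal sketch of what the anchor changes, the snippet below compares an unanchored and an anchored pattern. The `genre` vector is made-up toy data without brackets, and base R `grepl()` is used in place of `str_detect()` only so the example runs without extra packages.

```r
# Toy data (hypothetical): plain genre strings, no brackets or quotes
genre <- c("Thriller, Drama", "Comedy, Thriller", "Drama")

# Unanchored: TRUE wherever "Thriller" appears anywhere in the string
contains_thriller <- grepl("Thriller", genre)

# Anchored with ^: TRUE only when the string begins with "Thriller"
starts_with_thriller <- grepl("^Thriller", genre)

print(contains_thriller)     # TRUE TRUE FALSE
print(starts_with_thriller)  # TRUE FALSE FALSE
```

The same patterns can be passed to `str_detect()` inside `filter()`.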

edited Dec 01 '21 at 20:39

answered Dec 01 '21 at 20:33

TarJae


It doesn't return any rows. I think it's because the strings have square brackets, e.g. ['Documentary', 'Comedy', 'Drama', 'Fantasy', 'Mystery', 'Sci-Fi']. Is there a way to clean those out of the string? – RNewbie Dec 01 '21 at 21:25
That is possible. There is surely a way, but I need some rows of the data: run `dput(head(dfUSARating))` in your console, then copy the output and paste it into your question as an edit. :) – TarJae Dec 01 '21 at 21:37
Hope that's what you need. I took the beginning and the end because the full output was too long. – RNewbie Dec 01 '21 at 22:19
This is OK, but there is an issue: `Error in as.character.factor(x) : malformed factor`. Also, under your question there is a line with Share, Edit, Follow, Reopen. Click Edit and paste the code at the end of your question, not in the comments section. – TarJae Dec 01 '21 at 22:22
Just pasted it. – RNewbie Dec 01 '21 at 22:26
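
One possible way to deal with the brackets raised in the comments above is sketched here; it is not from the answer itself. The data frame `df` is a hypothetical three-row sample shaped like the strings in the question. Base R `grepl()` and `gsub()` are used so the example is self-contained; `str_detect()` and `str_remove_all()` from stringr could be swapped in with the same patterns.

```r
# Hypothetical sample mirroring the bracketed genre strings in the question
df <- data.frame(
  rowNum = c(1L, 2L, 3L),
  genre = factor(c(
    "['Thriller', 'Drama']",
    "['Comedy', 'Drama', 'Romance', 'Thriller']",
    "['Biography', 'Drama', 'Sport']"
  ))
)

# Convert the factor column to character so the string functions see the text
genre_chr <- as.character(df$genre)

# Option 1: keep the raw strings and put the literal "['" into the pattern.
# "[" is a regex metacharacter, so it is written as "\\[" in an R string.
starts_raw <- df[grepl("^\\['Thriller'", genre_chr), ]

# Option 2: strip brackets and quotes first, then anchor on the bare word
genre_clean <- gsub("\\[|\\]|'", "", genre_chr)
starts_clean <- df[grepl("^Thriller", genre_clean), ]

print(genre_clean[1])  # "Thriller, Drama"
```

Both options keep only the first row here, because it is the only one whose first listed genre is Thriller. Option 1 leaves the column untouched, while option 2 is handier if the cleaned strings are needed later.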
