
I'm new to R and I am trying to remove the year from a film title.

movie_title  <- c("carrie(2013)", "cars", "lesmiserables(2012)")

For example, here I would like to delete "(2013)" from the title Carrie, so that it becomes "carrie" instead of "carrie(2013)". I would then like to apply this to all similar titles in the movie_title column of my data frame.

What should I do? Thanks!

1 Answer


You will need to look up regular expressions (regex).

Using base R, you can do this:

gsub("\\(\\d{4}\\)", "", movie_title)

With the stringr package:

library(stringr)

str_remove(movie_title, "\\(\\d{4}\\)")

[1] "carrie"        "cars"          "lesmiserables"
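
Since the question asks about a column in a data frame, here is a minimal sketch of the same base R call applied to a column. The data frame name `movies` is hypothetical (the question does not name it); only the `movie_title` values come from the question.

```r
# Hypothetical data frame; only the movie_title values come from the question
movies <- data.frame(
  movie_title = c("carrie(2013)", "cars", "lesmiserables(2012)"),
  stringsAsFactors = FALSE
)

# Overwrite the column with the cleaned titles:
# "\\(" and "\\)" match literal parentheses, "\\d{4}" matches exactly four digits
movies$movie_title <- gsub("\\(\\d{4}\\)", "", movies$movie_title)

movies$movie_title
# [1] "carrie"        "cars"          "lesmiserables"
```

Titles without a year, such as "cars", are left unchanged because the pattern does not match them.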

Peter
  • Thank you! It worked. Do I understand it right that we use "\\" to specify what type of characters we would like to find/delete/etc? – Nikita Shmygin May 13 '20 at 23:23
  • The `\\` is used to "escape" a character. In regular expressions, `(` has a special meaning (it opens a group). Here, `\\(` tells R to look for *the character `(`* rather than treat it as the start of a group. – Aaron Montgomery May 14 '20 at 01:08
  • @NikitaShmygin if the answer answered your question, it is customary to tick the acceptance symbol at the side of the answer. Accepting an answer is important because it informs others that your issue is resolved, and it pins the answer to the top so that others reading your question see that answer first. – Peter May 14 '20 at 05:42
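
To illustrate the escaping point raised in the comments, here is a small sketch comparing the escaped pattern with an unescaped one. The variable names `escaped`, `unescaped`, and `literal` are only for illustration. Note that inside an R string, `"\\"` is a single backslash, so the regex engine actually receives `\(`.

```r
movie_title <- c("carrie(2013)", "cars", "lesmiserables(2012)")

# Escaped: "\\(" and "\\)" match literal parentheses, so all of "(2013)" is removed
escaped <- gsub("\\(\\d{4}\\)", "", movie_title)
escaped
# [1] "carrie"        "cars"          "lesmiserables"

# Unescaped: "(" and ")" form a group, so only the four digits match
# and the literal parentheses are left behind
unescaped <- gsub("(\\d{4})", "", movie_title)
unescaped
# [1] "carrie()"        "cars"            "lesmiserables()"

# fixed = TRUE treats the pattern as plain text, so no escaping is needed,
# but it can only remove one exact string rather than any four-digit year
literal <- gsub("(2013)", "", movie_title, fixed = TRUE)
literal
# [1] "carrie"              "cars"                "lesmiserables(2012)"
```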