
I would like to separate the dates from the text in a column of my data frame. My data look like this:

tt <- structure(list(V1 = c("(Q)üfür (2013)", "'Bi atlayip çikicam' cümlesini fazla ciddiye aldiysak zaar (2016)", 
"A'dan Z'ye (o biçim) (1975)", "Gün ortasinda karanlik (Anne) (1990)"
), V2 = c("Ilker Savaskurt", "Bugra Gülsoy", "Ahmet Mekin", 
"Yavuzer Çetinkaya")), .Names = c("V1", "V2"), row.names = c(80404L, 
90699L, 34694L, 53178L), class = "data.frame")

I used this script to separate dates from text.

pattern <- "[()]"
tt$info <- strsplit(tt$V1,pattern)
tt$Title <-sapply(tt$info, `[[`, 1)
tt$Year <- sapply(tt$info, function(m) (m)[2])

It returns the dates, but some titles contain more than one pair of parentheses. The date is always at the end of the text, so I need to change the script to extract only the last parenthesized group.

I have checked other questions here, but I couldn't come up with a solution. Thanks in advance.

eabanoz

2 Answers


With a regex you don't need to split the string. Try this:

tt$year=gsub(".*\\(([0-9]{4})\\).*","\\1", tt$V1)

tt
#>                                                                      V1
#> 80404                                                    (Q)üfür (2013)
#> 90699 'Bi atlayip çikicam' cümlesini fazla ciddiye aldiysak zaar (2016)
#> 34694                                       A'dan Z'ye (o biçim) (1975)
#> 53178                              Gün ortasinda karanlik (Anne) (1990)
#>                      V2 year
#> 80404   Ilker Savaskurt 2013
#> 90699      Bugra Gülsoy 2016
#> 34694       Ahmet Mekin 1975
#> 53178 Yavuzer Çetinkaya 1990

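Explanation: The pattern matches the whole string and captures the four digits enclosed in parentheses. gsub() then replaces the entire match with that captured group (\\1), leaving only the year. Because the leading .* is greedy, the last four-digit group in parentheses is the one captured. A string with no such group is returned unchanged.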
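The same capture-group idea can also split each entry into a title and a year in one pass, which is what the question's Title and Year columns were after. The following is only a sketch: it assumes the year always sits in a trailing "(YYYY)", and the pattern `pat` and the NA fallback are illustrative choices, not part of the answer above.

```r
# Three of the question's rows, rebuilt so the snippet runs on its own.
tt <- data.frame(
  V1 = c("(Q)üfür (2013)",
         "A'dan Z'ye (o biçim) (1975)",
         "Gün ortasinda karanlik (Anne) (1990)"),
  stringsAsFactors = FALSE
)

# Assumed pattern: title, optional spaces, then "(YYYY)" at the very end.
pat <- "^(.*?)\\s*\\((\\d{4})\\)\\s*$"

# Rows without a trailing "(YYYY)" become NA instead of echoing the input.
has_year <- grepl(pat, tt$V1, perl = TRUE)
tt$Title <- ifelse(has_year, sub(pat, "\\1", tt$V1, perl = TRUE), NA_character_)
tt$Year  <- ifelse(has_year, sub(pat, "\\2", tt$V1, perl = TRUE), NA_character_)
```

Anchoring the pattern with ^ and $ means earlier parentheses such as "(Q)" or "(Anne)" stay in the title, and the space before the year is dropped along with it.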

TC Zhang

Another option is stringi's stri_extract_last_regex(), which returns the last match of the pattern. Here that is the text inside the last pair of parentheses:

library(stringi)
stri_extract_last_regex(tt$V1, "(?<=\\().*?(?=\\))")
#[1] "2013" "2016" "1975" "1990"
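If installing stringi is not an option, roughly the same result can be had in base R. This is a sketch under assumptions: the vector `x`, the name `last_group`, and the extra "No year here" entry are made up for illustration, and the pattern is the one from the stringi call.

```r
# Two titles from the question plus one hypothetical entry without parentheses.
x <- c("(Q)üfür (2013)", "A'dan Z'ye (o biçim) (1975)", "No year here")

# Same lookaround pattern; lookbehind needs perl = TRUE in base R.
m <- regmatches(x, gregexpr("(?<=\\().*?(?=\\))", x, perl = TRUE))

# Keep the last match per string, or NA when there is none.
last_group <- vapply(
  m,
  function(g) if (length(g)) g[length(g)] else NA_character_,
  character(1)
)
```

gregexpr() finds every parenthesized group, so the last element of each match vector plays the role of stri_extract_last_regex().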
Ronak Shah
  • I would like to delete these dates and the space before them from V1, @Ronak. How can I do it? – eabanoz Feb 12 '19 at 07:12
  • @eabanoz Sorry, I do not have my system with me right now so wouldn't be able to help you immediately. You can ask a new question if you need immediate help. – Ronak Shah Feb 12 '19 at 09:14