I'm trying to coerce dates from two formats into a single one that I can easily feed into as.Date. Here's a sample:

library(dplyr)
df <- tibble(date = c("Mar 29 2017 9:30AM", "5/4/2016"))

I've tried this:

df %>% 
  mutate(date = gsub("([A-z]{3}) (\\d{2}) (\\d{4}).*", 
                     paste0(which(month.abb == "\\1"),"/\\2","/\\3"), date))

But it gave me this:

      date
1 /29/2017
2 5/4/2016

But I want this:

      date
1 3/29/2017
2 5/4/2016

It looks like when I use month.abb == "\\1", the comparison doesn't see the capture group's value ("Mar"); it just sees the literal string "\\1". I'd like to do this with regex if possible. I know there are other ways to do it, but I want something slick.

Any ideas?

Zafar

1 Answer


Here is one way with gsubfn, which lets the replacement be a function of the capture groups:

library(gsubfn)
df$date <- gsubfn("^([A-Za-z]{3})\\s+(\\d{2})\\s+(\\d{4}).*", function(x, y, z) 
                  paste(match(x, month.abb),y, z, sep="/"), df$date)
df$date
#[1] "3/29/2017" "5/4/2016" 
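
For context on why the attempt in the question returned "/29/2017": the replacement passed to gsub is an ordinary R expression, so it is evaluated once, before any matching happens. The sketch below just splits that expression into steps; the names idx and replacement are made up for illustration.

```r
# The replacement argument is evaluated before gsub() ever sees the input,
# so "\\1" here is only a literal two-character string, not the captured month.
idx <- which(month.abb == "\\1")   # no month abbreviation equals "\1", so this is integer(0)

# paste0() treats the zero-length idx as an empty string
replacement <- paste0(idx, "/\\2", "/\\3")
print(replacement)

# gsub() therefore only receives the fixed template "/\2/\3"
gsub("([A-Za-z]{3}) (\\d{2}) (\\d{4}).*", replacement, "Mar 29 2017 9:30AM")
```

gsubfn gets around this because its function is called once per match with the captured text as arguments, so match(x, month.abb) runs on "Mar" rather than on "\\1".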

Or use sub in combination with gsubfn, passing a named list that maps each month abbreviation to its number:

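# inner gsubfn: swap a leading month abbreviation for its number ("Mar" -> "3")
# outer sub: join the first three whitespace-separated fields with "/" and drop the time
sub("(\\S+)\\s+(\\S+)\\s+(\\S+).*", "\\1/\\2/\\3", 
      gsubfn("^([A-Za-z]{3})", setNames(as.list(1:12), month.abb), df$date))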
#[1] "3/29/2017" "5/4/2016" 
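
If you would rather not add a dependency, here is a rough base R sketch of the same idea. It is not a single regex call: it pulls the month out with sub and looks it up with match. The names dates, pattern, is_named, mon and rest are made up for illustration, and the final as.Date call assumes the numeric dates are month/day/year, which the question does not state.

```r
dates <- c("Mar 29 2017 9:30AM", "5/4/2016")

# month abbreviation, day, year, then anything else (the time)
pattern <- "^([A-Za-z]{3})\\s+(\\d{1,2})\\s+(\\d{4}).*"

# only touch the entries that start with a month abbreviation
is_named <- grepl(pattern, dates)

# look up the month number, and rebuild "day/year" from the other two groups
mon  <- match(sub(pattern, "\\1", dates[is_named]), month.abb)
rest <- sub(pattern, "\\2/\\3", dates[is_named])

dates[is_named] <- paste(mon, rest, sep = "/")
print(dates)

# assumption: month/day/year order for the numeric format
parsed <- as.Date(dates, format = "%m/%d/%Y")
print(parsed)
```

An abbreviation that is not in month.abb would come out as "NA/..." here, so check for that if your data is messier than the sample.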
akrun