-1

I would like to take a data.frame column of distance strings and convert it to numeric values. I would also like to use the same function on Twitter-style relative dates such as '3 days ago'.

If I were starting with:

x <- c("5 days ago", "1 day ago", "6 days ago")

I would end up with (days converted to hours):

x <- c(120, 24, 144)

Any help would be appreciated!

Roland
    Do you have always the same unit, so that you only need to extract numbers or do units vary? I.e., is your example representative? – Roland Jun 26 '14 at 07:28

3 Answers

1

Check the stringr package and its str_extract function:

x <- c("5 days ago", "1 day ago", "6 days ago")
library(stringr)
# "\\d+" matches the whole number, so multi-digit values such as "12 days ago" also work
x <- 24 * as.numeric(str_extract(x, "\\d+"))
David Arenburg
0

Try this:

strLine <- c("5 days ago", "1 day ago", "6 days ago")
x <- as.numeric(unlist(regmatches(strLine, gregexpr('\\(?[0-9]+', strLine)))) * 24
x
# [1] 120  24 144
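
Since the question mentions a data.frame column, here is a minimal sketch of the same idea applied to one. The helper name `extract_number` and the column names `posted` and `hours` are made up for illustration. It uses `regexpr` (first match only) rather than `gregexpr`, and fills in `NA` where a string contains no number, so the result always has one value per row.

```r
# Hypothetical helper: pull the first number out of each string, NA if there is none
extract_number <- function(s) {
  m <- regexpr("[0-9]+(\\.[0-9]+)?", s)       # position of the first number in each string
  hit <- !is.na(m) & m != -1                  # TRUE where a number was found
  out <- rep(NA_real_, length(s))
  out[hit] <- as.numeric(regmatches(s, m))    # regmatches returns matches for the hits only
  out
}

# Illustrative data; "just now" has no number and becomes NA
df <- data.frame(posted = c("5 days ago", "1 day ago", "12 days ago", "just now"),
                 stringsAsFactors = FALSE)
df$hours <- extract_number(df$posted) * 24
df$hours
```

Keeping one output value per input row matters here, because `unlist()` on the `gregexpr` result silently drops rows without a match and the assignment to the column would then fail or misalign.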
Eldar Agalarov
0

If your data consist only of strings of the form "number days ago" or "number miles", you can use regular expressions:

> x <- c("5 days ago", "1 day ago", "6 days ago", "21.2 miles", "1 mile")
> x[grep(" day",x)] <- as.numeric(gsub("[ daysago]","",x[grep(" day",x)] ))*24
> x
[1] "120"        "24"         "144"        "21.2 miles" "1 mile"    
> x[grep(" mile",x)] <- as.numeric(gsub("[ miles]","",x[grep(" mile",x)] )) 
> x
[1] "120"  "24"   "144"  "21.2" "1"   
> x <- as.numeric(x)
> x
[1] 120.0  24.0 144.0  21.2   1.0
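
If the units vary, one possible generalisation is to look the multiplier up by unit instead of writing one grep/gsub pair per unit. The following is only a sketch under assumptions: the function name `to_numeric` and the `multipliers` table are hypothetical, the unit is taken to be the first word after the number, and plurals are handled by simply dropping a trailing "s".

```r
# Hypothetical helper: number times a per-unit multiplier (days -> hours, miles unchanged)
to_numeric <- function(s, multipliers = c(day = 24, hour = 1, mile = 1)) {
  m <- regexpr("[0-9]+(\\.[0-9]+)?", s)          # first number, optional decimal part
  hit <- !is.na(m) & m != -1
  value <- rep(NA_real_, length(s))
  value[hit] <- as.numeric(regmatches(s, m))
  unit <- sub("^[^A-Za-z]*([A-Za-z]+).*$", "\\1", s)  # first alphabetic word
  unit <- sub("s$", "", unit)                          # "days" -> "day", "miles" -> "mile"
  value * unname(multipliers[unit])                    # unknown units give NA
}

x <- c("5 days ago", "1 day ago", "6 days ago", "21.2 miles", "1 mile")
to_numeric(x)
```

Adding a new unit then only means adding an entry to `multipliers`; strings with an unlisted unit come back as `NA` rather than being converted with the wrong factor.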
Pigeon
  • This solution allows for the regex grouping with the grep command. In my case the `regmatches(strLine, gregexpr('\\(?[0-9]+', strLine))` @Eldar wrote was more useful to throw out everything but the digits, but I can definitely see the need for gsub in similar circumstances. – user3778002 Jun 26 '14 at 08:59
  • The code I used just changed gsub to strip everything but the number. `x[grep("day",x)] <- as.numeric(gsub("[[:alpha:]|[:blank:]]","",x[grep("day",x)] )) * 24` – user3778002 Jun 26 '14 at 09:36