Given the following call:

 agrep('timothy', c('timo','tim','timoth', 'timothys'), max.distance = 0.01, value=TRUE)

I want to output the original string and all possible results together in a data frame as below.

Original Replace1 Replace2
timothy  timoth   timothys

Is this possible or is there a better function to use?

Rtab

1 Answer

I'd personally keep it in "long" format rather than wide (you can always transform it later):

data.frame(
  original = "timothy",
  replacement = agrep('timothy', c('timo','tim','timoth', 'timothys'), max.distance = 0.01, value=TRUE),
  stringsAsFactors=FALSE
)
##   original replacement
## 1  timothy      timoth
## 2  timothy    timothys
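If the wide layout from the question is still needed, the long data frame can be reshaped afterwards. The following is only a sketch using base R's reshape(); the names long, slot and wide are illustrative, and the Replace1/Replace2 column names are taken from the question.

```r
# Long-format result, built as in the example above
long <- data.frame(
  original = "timothy",
  replacement = agrep('timothy', c('timo','tim','timoth', 'timothys'),
                      max.distance = 0.01, value = TRUE),
  stringsAsFactors = FALSE
)

# Number the replacements within each original string: Replace1, Replace2, ...
long$slot <- paste0(
  "Replace",
  ave(seq_along(long$original), long$original, FUN = seq_along)
)

# One row per original string, one column per numbered replacement
wide <- reshape(long, idvar = "original", timevar = "slot", direction = "wide")

# reshape() prefixes the new columns with "replacement."; strip that prefix
names(wide) <- sub("^replacement\\.", "", names(wide))
wide
```

Original strings with fewer matches than the widest row get NA in the unused columns, which is one reason the long format is usually easier to work with.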

You will likely want to do this more than once, so I'd wrap it in a function. Since agrep() can return character(0), which data.frame() cannot combine with a length-one original column, we also add a small helper that substitutes a fallback value for zero-length input:

`%|l0%` <- function(x, y) if (length(x) == 0) y else x

agrep_to_data_frame <- function(pattern, x, max.distance=0.01, costs=NULL) {
  # Approximate matches of `pattern` in `x`, or NA if there are none
  matches <- agrep(pattern, x, max.distance = max.distance, costs = costs, value=TRUE)
  data.frame(
    original = pattern,
    replacement = matches %|l0% NA_character_,
    stringsAsFactors=FALSE
  )
}

And now it's a single call, which you can use in purrr::map2(), mapply(), etc.

agrep_to_data_frame('timothy', c('timo','tim','timoth', 'timothys'))
##   original replacement
## 1  timothy      timoth
## 2  timothy    timothys
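As a rough illustration of applying this to several strings at once, here is a sketch using base R's lapply() and rbind() in place of purrr::map2(). The helper and function are repeated so the snippet runs on its own; the second pattern "zzzzzz" and the names patterns, candidates and results are made up for the example.

```r
# Fallback helper: return y when x has length zero
`%|l0%` <- function(x, y) if (length(x) == 0) y else x

agrep_to_data_frame <- function(pattern, x, max.distance = 0.01, costs = NULL) {
  matches <- agrep(pattern, x, max.distance = max.distance,
                   costs = costs, value = TRUE)
  data.frame(
    original = pattern,
    replacement = matches %|l0% NA_character_,
    stringsAsFactors = FALSE
  )
}

# Hypothetical inputs: one pattern with matches, one without
patterns <- c("timothy", "zzzzzz")
candidates <- c("timo", "tim", "timoth", "timothys")

# One data frame per pattern, stacked into a single long data frame
results <- do.call(
  rbind,
  lapply(patterns, agrep_to_data_frame, x = candidates)
)
results
```

A pattern with no approximate match contributes a single row with NA in replacement, so every original string still appears in the result.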
hrbrmstr