I'd like the `adist` function to work on words the same way it works on characters. That is, I'd like each deletion, substitution, or insertion to apply to a whole word rather than a single character. For example, I want "Alert 12 went off at 3am" and "Alert 17 was heard at 3am" to have a Levenshtein distance of 3, because three word substitutions are needed to get from one string to the other. Thanks.

  • So you want to count the different words? `strsplit` would get you most of the way there. – cory Jan 03 '20 at 12:52
  • Read this [discussion](https://stackoverflow.com/questions/5055839/word-level-edit-distance-of-a-sentence) – phiver Jan 03 '20 at 12:53

1 Answer

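You can try the following code, which counts the words that differ between the two strings. Note that it compares the strings as bags of words, so word order is ignored: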

library(vecsets)
# vsetdiff() is a multiset difference: it keeps the words of s1 that have no match in s2
d <- length(vsetdiff(unlist(strsplit(s1, " ")), unlist(strsplit(s2, " "))))

which gives

> d
[1] 3

Data

s1 <- "Alert 12 went off at 3am"
s2 <- "Alert 17 was heard at 3am"
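
If word order matters, as in a true Levenshtein distance, the count above may not be enough, since it only looks at which words are present. A minimal sketch of a word-level edit distance in base R follows. The function name `word_adist` is hypothetical (it is not part of base R or any package), and it assumes words are separated by single spaces:

```r
# Hypothetical helper: Levenshtein distance computed over words instead of characters.
# Assumes words are separated by single spaces.
word_adist <- function(a, b) {
  x <- strsplit(a, " ", fixed = TRUE)[[1]]
  y <- strsplit(b, " ", fixed = TRUE)[[1]]
  n <- length(x)
  m <- length(y)

  # d[i + 1, j + 1] holds the distance between the first i words of x
  # and the first j words of y
  d <- matrix(0L, nrow = n + 1, ncol = m + 1)
  d[, 1] <- 0:n  # delete all i words
  d[1, ] <- 0:m  # insert all j words

  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (x[i] == y[j]) 0L else 1L
      d[i + 1, j + 1] <- min(
        d[i, j + 1] + 1L,  # deletion
        d[i + 1, j] + 1L,  # insertion
        d[i, j] + cost     # substitution (or match)
      )
    }
  }
  d[n + 1, m + 1]
}

s1 <- "Alert 12 went off at 3am"
s2 <- "Alert 17 was heard at 3am"
word_adist(s1, s2)
# [1] 3
```

Unlike the set-based count, this treats a reordered sentence as edited: `word_adist("a b", "b a")` is 2, while the multiset difference of the same two strings is empty.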
ThomasIsCoding