I'm trying to get agrep to match "seconds" to "second" and not "millisecond", but there doesn't seem to be any value of costs to accomplish this.
I am especially confused that no value of the costs for deletions/insertions seems to do the trick. As I see it, "second" is one deletion from "seconds", whereas "millisecond" is one deletion and 5 insertions away.
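To make that reasoning concrete, here is a minimal sketch of the whole-string edit distances I have in mind, computed with adist() from base R (utils). It assumes default unit costs and that the comparison is made against each whole string, which may not be how agrep actually compares things; the variable names are just for illustration.

```r
# Whole-string generalized Levenshtein distances with default (unit) costs.
# adist() is in the utils package, which is attached by default.
target <- "seconds"
candidates <- c("second", "millisecond")

d <- adist(target, candidates)

# Label the 1 x 2 result so it is easier to read.
dimnames(d) <- list(target, candidates)
print(d)
# "second" is at distance 1 (one deletion);
# "millisecond" is at distance 6 (one deletion plus five insertions).
```

By this measure "second" is clearly the closer of the two, which is why I expected some combination of costs to separate them.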
(Warning: the lapply might take a while. You would get the same result much quicker with length.out = 10 and 0:10.)
rng = c(seq(0, 1, length.out = 20), 0:100)
x = expand.grid(insertions = rng, substitutions = rng, deletions = rng)
units = c("millisecond", "second", "minute", "hour", "day",
          "week", "month", "quarter", "year")
x$match = lapply(seq_len(nrow(x)), function(ii)
  agrep('second', units, value = TRUE, costs = x[ii, ]))
x$match_which = sapply(x$match, paste, collapse = '|')
sort(table(x$match_which))
# millisecond|second|minute|hour|week|month|year
# 57
# millisecond|second|minute|hour|week|month|quarter|year
# 13276
# millisecond|second|month
# 23316
# millisecond|second|minute|month|quarter
# 37842
# millisecond|second|minute|quarter
# 251480
# millisecond|second|minute|hour|day|week|month|quarter|year
# 409865
# millisecond|second
# 1035725
What am I missing here? Is there no way to accomplish my task (match "seconds" to "second" and not "millisecond") with agrep?