
I would like to calculate the error percentage between two strings. That is, if one string is the ground truth and the other is a typed string, I would like to count the number of mistakes in the typed string.

For example:

ground truth = "This is a test"
typed = "Thisi is atest"

The typed string contains 2 errors (an additional "i" and a missing space).

I think this can be done using some distance metric. Is there a library in Java for calculating such an error rate?

machinery
  • *"Is there a library?"* [Questions asking us to recommend or find a software library is off-topic for Stack Overflow](https://stackoverflow.com/help/on-topic) – Andreas Apr 07 '20 at 23:32

1 Answer


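You are referring to the Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. It is implemented in the Apache Commons Text library: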

See here: http://commons.apache.org/proper/commons-text/

And here: https://commons.apache.org/sandbox/commons-text/jacoco/org.apache.commons.text.similarity/LevenshteinDistance.java.html
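With Commons Text on the classpath, the call should be `LevenshteinDistance.getDefaultInstance().apply(groundTruth, typed)`. If you would rather not add a dependency, here is a minimal sketch of the same metric in plain Java. The class name `EditDistanceSketch` and the `errorRate` helper are my own, hypothetical names, and dividing the distance by the length of the ground truth is only one possible definition of an "error percentage", so adjust it to your needs.

```java
// Hypothetical helper class, not part of Apache Commons Text.
class EditDistanceSketch {

    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int levenshtein(CharSequence a, CharSequence b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b: j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Distance from the first i chars of a to the empty string: i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = (a.charAt(i - 1) == b.charAt(j - 1)) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always holds the most recently completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumption: error rate = edit distance / length of the ground truth.
    // The result can exceed 1.0 if the typed string is much longer than the truth.
    static double errorRate(String groundTruth, String typed) {
        if (groundTruth.isEmpty()) {
            return typed.isEmpty() ? 0.0 : 1.0;
        }
        return (double) levenshtein(groundTruth, typed) / groundTruth.length();
    }

    public static void main(String[] args) {
        String groundTruth = "This is a test";
        String typed = "Thisi is atest";
        System.out.println(levenshtein(groundTruth, typed)); // prints 2
        System.out.println(errorRate(groundTruth, typed) * 100 + " %");
    }
}
```

For the strings in the question this gives a distance of 2 (one inserted "i", one deleted space), which against a 14-character ground truth is an error rate of 2/14.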

nCessity