Does anyone know of an edit-distance algorithm that only counts substitutions and insertions? Basically, it would be the Levenshtein distance algorithm without deletions.
1
-
What's your question exactly? – But I'm Not A Wrapper Class Oct 10 '14 at 16:00
-
I suppose I was wondering whether there are any algorithms I have never heard of that do exactly what I explained above. Or, if you know a way to separately count the deletions involved in a Levenshtein edit, that would also be helpful. – kchoi Oct 10 '14 at 16:35
2 Answers
0
You can use almost the same dynamic programming solution that is used for computing normal Levenshtein distance, but without transitions that correspond to deletions.
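A minimal sketch of that idea in Python, assuming unit costs and that edits are applied to the first string only (so a "deletion" means dropping a character of `source`). The function name `no_deletion_distance` and the use of `math.inf` for unreachable states are illustrative choices, not something the answer specifies.

```python
import math


def no_deletion_distance(source: str, target: str) -> float:
    """Cost of turning source into target using only insertions and substitutions.

    Returns math.inf when no such edit sequence exists, which happens
    whenever source is longer than target.
    """
    m, n = len(source), len(target)
    # dist[i][j]: cost of turning source[:i] into target[:j]
    dist = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        dist[0][j] = j  # empty source prefix: j insertions
    # dist[i][0] for i > 0 stays infinite: reaching it would need deletions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitution = dist[i - 1][j - 1] + (source[i - 1] != target[j - 1])
            insertion = dist[i][j - 1] + 1
            # The usual third option, dist[i - 1][j] + 1 (deletion), is left out
            dist[i][j] = min(substitution, insertion)
    return dist[m][n]


print(no_deletion_distance("cat", "cart"))  # one insertion
print(no_deletion_distance("cart", "cat"))  # unreachable without a deletion
```

Note that the base cases change as well as the recurrence: the first column can no longer be filled with `i`, because those values count deletions.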

kraskevich
-
So, Levenshtein distance is computed by `min(Lev[i-1][j], Lev[i][j-1], Lev[i-1][j-1]) + 1`. When you say to skip the transition corresponding to deletion, do you mean removing `Lev[i-1][j]`, the deletion term, from the recurrence relation? – kchoi Oct 10 '14 at 16:33
-
It depends on whether the operations can be performed on both strings or on the first one only. In the former case, yes. In the latter, nothing needs to change at all, because a deletion from one string is equivalent to an insertion into the other. – kraskevich Oct 10 '14 at 16:45
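As a hedged illustration of the insertion/deletion symmetry mentioned in that comment, and of the asker's side question about counting deletions separately, here is a sketch that runs the standard unit-cost Levenshtein DP and then backtracks through the table to tally each operation type. The function name `edit_operation_counts` and the tie-breaking order (diagonal first, then deletion, then insertion) are assumptions; optimal edit scripts are not unique, so the counts describe one optimal script, not the only one.

```python
def edit_operation_counts(source: str, target: str) -> dict:
    """Tally operations along one optimal Levenshtein edit script."""
    m, n = len(source), len(target)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i  # i deletions
    for j in range(n + 1):
        dist[0][j] = j  # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            mismatch = source[i - 1] != target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,             # deletion
                dist[i][j - 1] + 1,             # insertion
                dist[i - 1][j - 1] + mismatch,  # match or substitution
            )

    counts = {"substitutions": 0, "insertions": 0, "deletions": 0}
    i, j = m, n
    # Walk back from the bottom-right corner, preferring the diagonal move
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and source[i - 1] != target[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + mismatch:
            counts["substitutions"] += mismatch
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1
    return counts


print(edit_operation_counts("cart", "cat"))
print(edit_operation_counts("cat", "cart"))
```

Swapping the arguments turns the single deletion into a single insertion, which is the equivalence the comment describes.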
0
Say your Levenshtein Distance algorithm is the following:
    For each i = 1...M
        For each j = 1...N
            // min(deletion, insertion, match/substitution)
            D(i,j) = min(D(i-1,j) + 1, D(i,j-1) + 1, D(i-1,j-1) + (X(i)=Y(j) ? 0 : 2))
Remove the term that counts deletions, which leaves you with:
    For each i = 1...M
        For each j = 1...N
            // min(insertion, match/substitution)
            D(i,j) = min(D(i,j-1) + 1, D(i-1,j-1) + (X(i)=Y(j) ? 0 : 2))
Note: This particular algorithm scores a substitution as 2 points and each of the other two operations (deletion, insertion) as 1 point. There are many variations that score differently.
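A runnable sketch of the reduced recurrence, assuming the same scoring (insertion 1, substitution 2) and 0-based Python strings in place of `X(1..M)` and `Y(1..N)`. The function name `weighted_no_deletion`, the cost parameters, and the base cases are assumptions: the pseudocode does not spell out `D(i,0)`, and here it is treated as infinite for `i > 0`, since consuming characters of `x` against an empty `y` would require deletions.

```python
import math


def weighted_no_deletion(x: str, y: str, ins_cost: int = 1, sub_cost: int = 2) -> float:
    """Edit cost from x to y with insertions and substitutions only."""
    # prev[j] holds D(i-1, j); row 0 is built from insertions alone
    prev = [j * ins_cost for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        # D(i, 0) is infinite for i > 0 (assumed base case, see above)
        curr = [math.inf] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            match_or_sub = prev[j - 1] + (0 if x[i - 1] == y[j - 1] else sub_cost)
            insertion = curr[j - 1] + ins_cost
            curr[j] = min(insertion, match_or_sub)
        prev = curr
    return prev[len(y)]


print(weighted_no_deletion("abc", "abd"))          # one substitution
print(weighted_no_deletion("kitten", "sitting"))   # two substitutions, one insertion
```

Only two rows are kept at a time because each cell depends on the previous row and the cell to its left. With deletions gone, a substitution can no longer be replaced by a delete-plus-insert pair, so the choice of `sub_cost` directly scales the result.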

But I'm Not A Wrapper Class