In looking through the dynamic programming algorithm for computing the minimum edit distance between two strings, I am having a hard time grasping one thing. It seems to me that, given the two strings s and t, inserting a character into s would be the same as deleting a character from t. Why, then, do we need to consider these operations separately when computing the edit distance? I always have a hard time working out the indices in the recurrence relation because I can't intuitively understand this part.
I've read through Skiena and some other sources, but none of them explains this part well. This SO link explains the insert and delete operations better than anything else I've found, in terms of which string is being inserted into or deleted from, but I still can't figure out why the two operations aren't one and the same.
Edit: Ok, I didn't do a very good job of detailing the source of my confusion.
Skiena explains computing the minimum edit distance m(i,j) between the first i characters of a string s and the first j characters of a string t, given already-computed solutions to the subproblems, as follows. m(i,j) is the minimum of these 3 possibilities:
opt[MATCH] = m[i-1][j-1].cost + match(s[i],t[j]);
opt[INSERT] = m[i][j-1].cost + indel(t[j]);
opt[DELETE] = m[i-1][j].cost + indel(s[i]);
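For reference, the three cases above can be sketched as a complete function. This is only a minimal sketch, not Skiena's code: it assumes unit cost for every insertion, deletion, and mismatch, and the names edit_distance, min3, and MAXLEN are hypothetical. It also uses ordinary 0-based C strings, so the characters compared for cell (i,j) are s[i-1] and t[j-1] rather than s[i] and t[j].

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64 /* hypothetical bound on string length, for illustration */

/* Smallest of three ints. */
static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Minimum number of edits that turn s into t (unit costs assumed). */
int edit_distance(const char *s, const char *t) {
    size_t n = strlen(s), k = strlen(t);
    int m[MAXLEN][MAXLEN];
    size_t i, j;

    assert(n < MAXLEN && k < MAXLEN);

    for (i = 0; i <= n; i++) m[i][0] = (int)i; /* delete i chars of s */
    for (j = 0; j <= k; j++) m[0][j] = (int)j; /* insert j chars of t */

    for (i = 1; i <= n; i++) {
        for (j = 1; j <= k; j++) {
            /* MATCH: consume one char of each string (0 if equal, 1 if not) */
            int match = m[i - 1][j - 1] + (s[i - 1] != t[j - 1]);
            /* INSERT: consume t[j-1] only, i.e. add it to s */
            int ins = m[i][j - 1] + 1;
            /* DELETE: consume s[i-1] only, i.e. drop it from s */
            int del = m[i - 1][j] + 1;
            m[i][j] = min3(match, ins, del);
        }
    }
    return m[n][k];
}
```

Under these assumptions, edit_distance("SU", "SATU") evaluates to 2, and so does edit_distance("SATU", "SU").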
As I understand it, all 3 operations are operations on the string s. An INSERT means you have to insert a character at the end of string s to get the minimum edit distance. A DELETE means you have to delete the character at the end of string s to get the minimum edit distance.
Given s = "SU" and t = "SATU", INSERT and DELETE would be as follows:
Insert:
SU_
SATU
Delete:
SU
SATU_
My confusion was that an INSERT into s is the same as a DELETION from t. I'm probably confused about something basic, but it's not intuitive to me yet.
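One way to probe this is to count which operations an optimal edit script uses, then swap the two strings and count again. The sketch below does that with a simple traceback over the table. It rests on several assumptions: unit costs, a tie-breaking order of match first, then insert, then delete, and the hypothetical names op_counts, count_ops, and MAXLEN. Other tie-breaking orders can give different counts when several optimal scripts exist.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 64 /* hypothetical bound on string length, for illustration */

/* How many of each operation one optimal script uses to turn s into t. */
struct op_counts {
    int inserts;
    int deletes;
    int substitutions;
};

struct op_counts count_ops(const char *s, const char *t) {
    size_t n = strlen(s), k = strlen(t);
    int m[MAXLEN][MAXLEN];
    struct op_counts ops = {0, 0, 0};
    size_t i, j;

    assert(n < MAXLEN && k < MAXLEN);

    /* Fill the table with the same three-way minimum, unit costs assumed. */
    for (i = 0; i <= n; i++) m[i][0] = (int)i;
    for (j = 0; j <= k; j++) m[0][j] = (int)j;
    for (i = 1; i <= n; i++) {
        for (j = 1; j <= k; j++) {
            int best = m[i - 1][j - 1] + (s[i - 1] != t[j - 1]);
            if (m[i][j - 1] + 1 < best) best = m[i][j - 1] + 1;
            if (m[i - 1][j] + 1 < best) best = m[i - 1][j] + 1;
            m[i][j] = best;
        }
    }

    /* Walk back from (n, k) to (0, 0), preferring match, then insert. */
    i = n;
    j = k;
    while (i > 0 || j > 0) {
        if (i > 0 && j > 0 &&
            m[i][j] == m[i - 1][j - 1] + (s[i - 1] != t[j - 1])) {
            if (s[i - 1] != t[j - 1]) ops.substitutions++;
            i--;
            j--;
        } else if (j > 0 && m[i][j] == m[i][j - 1] + 1) {
            ops.inserts++; /* t[j-1] was added to s */
            j--;
        } else {
            ops.deletes++; /* s[i-1] was dropped from s */
            i--;
        }
    }
    return ops;
}
```

With these assumptions, count_ops("SU", "SATU") reports 2 inserts and 0 deletes, while count_ops("SATU", "SU") reports 0 inserts and 2 deletes. The total cost is the same in both directions, but for a fixed source string the two operations move through the table along different axes.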
Edit 2: I think this link kind of clarifies my confusion but I'd love an explanation given my specific questions above.