Based on the example, I'm assuming the search string needs to be found in the same order as given (i.e. `ACB` isn't a valid find for `ABC`).
General DP approach / hints:
The function we're trying to minimize is the distance so far, so this should be the value stored in each cell of your matrix.
For a given position in the string and a given position in the search string, we need to look back at all previous positions in the string, paired with the previous position in the search string. For each of these, we add the distance from there to the current position and record the minimum.
To illustrate, assume a search string of `A, B, C, D`. Then for `ABC` in the search string and position `i` in the string, we need to look at positions `0` through `i-1` for `AB`.
Given a string `BACCD` and a search string `BCD`, when looking at the last position of both, we'd have something like:
DP(BACCD, BCD) = min(4+DP(B, BC), 3+DP(BA, BC), 2+DP(BAC, BC), 1+DP(BACC, BC))
But `DP(B, BC)` and `DP(BA, BC)` are invalid since `B` and `BA` don't contain `BC` and, more specifically, don't end with a `C` (thus they can be assigned some arbitrarily large value).
Once we get to the last character in the search string, the value indicates that we found the complete search string ending at that position in the string, so it should be compared to the global minimum.
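A minimal Python sketch of the recurrence described above, to make the table layout concrete. The function name `min_ordered_window_distance`, the choice to return `None` when there is no match, and the convention that "distance" means end index minus start index are all assumptions of this sketch, not something fixed by the question.

```python
import math


def min_ordered_window_distance(text, pattern):
    """Smallest (end - start) over windows of `text` that contain `pattern`
    in order, or None if no such window exists (hypothetical helper)."""
    n, m = len(text), len(pattern)
    if n == 0 or m == 0:
        return None  # assumption: empty inputs count as "not found"

    INF = math.inf  # the "arbitrarily large value" for invalid cells
    # dp[j][i]: smallest distance from the match of pattern[0] to position i,
    # given that pattern[0..j] is matched in order and pattern[j] sits at text[i].
    dp = [[INF] * n for _ in range(m)]

    for i in range(n):
        if text[i] == pattern[0]:
            dp[0][i] = 0  # a window that starts and ends at i

    for j in range(1, m):
        for i in range(n):
            if text[i] != pattern[j]:
                continue  # cell stays invalid
            # Look back at every earlier position for the previous pattern letter.
            for k in range(i):
                dp[j][i] = min(dp[j][i], dp[j - 1][k] + (i - k))

    best = min(dp[m - 1])  # the global minimum over all end positions
    return None if best == INF else best
```

For the `BACCD` / `BCD` example this evaluates `min(2 + 2, 1 + 3)` at the final `D` and returns `4`; the three nested loops give the `O(m*n^2)` running time.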
Optimization:
To get an `O(m*n)` rather than `O(m*n^2)` running time, it's worth noting that you can stop iterating backwards as soon as you see another occurrence of the current letter (because any sequence up to that point is longer than the same sequence with only the last letter moved forward), i.e.:
Given a string `ABCCD` and a search string `ABC`, when checking the second `C`, we can stop as soon as we get to the first `C` (which is right away), since `ABC` is shorter than `ABCC`.
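The early stop can be sketched in Python as follows. This is an illustrative sketch only: the name `min_ordered_window_distance_fast`, returning `None` for "not found", and measuring distance as end index minus start index are assumptions. It also keeps just two rows of the table, which is a space-saving choice of this sketch rather than part of the idea above.

```python
import math


def min_ordered_window_distance_fast(text, pattern):
    """Same result as the plain DP, but the backward scan stops at the
    previous occurrence of the current pattern letter (hypothetical helper)."""
    n, m = len(text), len(pattern)
    if n == 0 or m == 0:
        return None  # assumption: empty inputs count as "not found"

    INF = math.inf
    # prev[i]: best distance for pattern[0..j-1] ending exactly at text[i].
    prev = [0 if c == pattern[0] else INF for c in text]

    for j in range(1, m):
        cur = [INF] * n
        for i in range(n):
            if text[i] != pattern[j]:
                continue
            for k in range(i - 1, -1, -1):
                cur[i] = min(cur[i], prev[k] + (i - k))
                if text[k] == pattern[j]:
                    # An earlier copy of the same letter: any match that starts
                    # before k is shorter when it ends at k instead of at i.
                    break
        prev = cur

    best = min(prev)
    return None if best == INF else best
```

For a fixed pattern letter, each backward scan covers only the gap back to the previous occurrence of that letter, so the scans for one row add up to at most `n` steps. Note that individual cells may hold a larger value than in the unoptimized table; only the global minimum is preserved.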
Side note:
I think one can do better than the DP approach, but if I were to suggest something else here, it would likely just be copied from / inspired by one of the answers to "Find length of smallest window that contains all the characters of a string in another string".