I'm looking for an algorithm that addresses the LCS problem for two strings with the following conditions:
Each string consists of English characters and each character has a weight. For example:
sequence 1 (S1): "ABBCD" with weights [1, 2, 4, 1, 3]
sequence 2 (S2): "TBDC" with weights [7, 5, 1, 2]
Suppose that MW(s, S)
is defined as the maximum weight of the sub-sequence s in string S
with respect to the associated weights. The heaviest common sub-sequence (HCS) is defined as:
HCS = argmin(MW(s, S1), MW(s, S2))
The algorithm output should be the indexes of HCS in both strings and the weight. In this case, the indexes will be:
I_S1 = [2, 4] --> MW("BD", "ABBCD") = 7
I_S2 = [1, 2] --> MW("BD", "TBDC") = 6
Therefore HCS = "BD", and weight = min(MW(s, S1), MW(s, S2)) = 6.