Reduction for Longest Common Subsequence (LCS)

Question

I am working on a problem involving the LCS of two strings, and I was wondering whether there is a reduction from the general case of LCS to its binary version, i.e., whether solving LCS for bit-strings also lets us solve LCS over an arbitrary (but finite) alphabet.

It seems reasonable to me that such a reduction exists (based on the complexity of algorithms for various versions of the problem); however, I couldn't find anything like that.

Nice question, but why should the alphabet matter if the test is whether or not two particular letters are equal? — גלעד ברקן, Jun 03 '16 at 01:50
I don't understand what you mean by "two particular letters are equal". In the binary version the alphabet is {0,1}; however, I need the case in which the alphabet is an arbitrary finite set (e.g. the English alphabet {a,b,c,...,z}, which has 26 letters). — Nima, Jun 03 '16 at 02:13
What I mean is that the classic LCS algorithm is based on a simple test, `does string1[i] == string2[j] ?` So what difference would it make what kind of alphabet you have? — גלעד ברקן, Jun 03 '16 at 07:56
Oh, OK! The thing is that I am working on something related to LCS, and I could solve it for the binary version; now I am looking for a reduction to extend my solution to the general case! — Nima, Jun 03 '16 at 07:59
I don't think that would be necessary; what I am concerned with is just a reduction from the general case to the binary version! — Nima, Jun 03 '16 at 09:04
@גלעדברקן: It's still a valid (and IMHO interesting) question whether the question "Given strings A and B on the 3-letter alphabet {x, y, z}, do A and B have a common subsequence of length at least k?" can be answered correctly by an algorithm that can only compute LCSes of binary strings. With a binary alphabet, if we know p != q and q != r then we can infer p = r, but in a 3-character alphabet we are not allowed to make this inference, so the 3-character alphabet problem is (at first glance, at least) a harder problem. — j_random_hacker, Jun 03 '16 at 15:22
This is probably difficult, since there don't seem to be obvious transformations of the input strings that lead to predictable transformations of the output (LCS length). E.g. if we know LCS(A, B) has length k, then all we can say about LCS(A.A, B.B) is that it has length at least 2k -- but it could be larger (e.g. for A=`aba`, B=`baab`, LCS(A, B) = 2 but LCS(A.A, B.B) = 5 > 2*2). Similarly all we know about LCS(A.rev(A), B.rev(B)) is that it's >= 2k (same example works). — j_random_hacker, Jun 03 '16 at 16:00
Another operation would be to duplicate each character: so dup(`abc`) = `aabbcc` for example. Here the immediate bound is again LCS(dup(A), dup(B)) >= 2*LCS(A, B), though the earlier example shows no gap: with the same values of A and B as before, dup(A) = `aabbaa` and dup(B) = `bbaaaabb`, and LCS(dup(A), dup(B)) = 4 = 2*2. There may be other more esoteric operations that can be done that *are* "respected" by the LCS function (that is, performing them on the inputs causes the value of the resulting LCS to be *exactly* expressible in terms of the LCS of the original strings), but I don't know them. (We do have, trivially, that LCS(A, B) = LCS(rev(A), rev(B)).) — j_random_hacker, Jun 03 '16 at 16:04
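
The small examples discussed in the comments above are easy to check mechanically. Below is a minimal Python sketch, not taken from the thread: the helper names `lcs_length`, `dup` and `rev` are made up for illustration, and `lcs_length` is just the textbook dynamic program whose only alphabet-dependent step is the equality test mentioned earlier. It prints the LCS length for each transformation, so every claimed value can be confirmed or refuted by running it.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b.

    Classic O(len(a) * len(b)) dynamic program, keeping one row at a time.
    """
    prev = [0] * (len(b) + 1)  # row for the empty prefix of a
    for ch in a:
        curr = [0]  # column 0: empty prefix of b
        for j, other in enumerate(b, start=1):
            if ch == other:  # the only place the alphabet is consulted
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def dup(s: str) -> str:
    """Duplicate every character, e.g. dup('abc') == 'aabbcc'."""
    return "".join(ch * 2 for ch in s)


def rev(s: str) -> str:
    """Reverse a string."""
    return s[::-1]


if __name__ == "__main__":
    A, B = "aba", "baab"  # example strings from the comments
    print("LCS(A, B)                 =", lcs_length(A, B))
    print("LCS(A.A, B.B)             =", lcs_length(A + A, B + B))
    print("LCS(A.rev(A), B.rev(B))   =", lcs_length(A + rev(A), B + rev(B)))
    print("LCS(dup(A), dup(B))       =", lcs_length(dup(A), dup(B)))
    print("LCS(rev(A), rev(B))       =", lcs_length(rev(A), rev(B)))
```

Checking a handful of strings this way says nothing about whether a reduction exists; it only helps rule candidate transformations in or out before trying to prove anything about them.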
