I am given n strings (2 <= n <= 4), each made up of only two letters: a and b. I have to find the length of the longest substring that is common to all of the strings. A solution is guaranteed to exist. Let's see an example:
n=4
abbabaaaaabb
aaaababab
bbbbaaaab
aaaaaaabaaab
The result is 5 (because the longest common substring is "aaaab").
I don't have to print (or even know) the substring, I just need to print its length.
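To make the task concrete, here is a minimal brute-force sketch that can be used to check small cases such as the example above. The function name bruteForceCommonLength is hypothetical (it is not part of my submission), and this is only a reference for tiny inputs, far too slow for strings of length 13 000.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Brute-force reference (hypothetical helper): try every substring of the
// first string, longest first, and return the length of the first one that
// occurs in all the other strings. Returns 0 if nothing is common.
int bruteForceCommonLength(const std::vector<std::string>& strs) {
    assert(!strs.empty());
    const std::string& first = strs[0];
    for (std::size_t len = first.size(); len >= 1; --len) {
        for (std::size_t left = 0; left + len <= first.size(); ++left) {
            const std::string candidate = first.substr(left, len);
            // The candidate must occur somewhere in every other string.
            const bool inAll = std::all_of(
                strs.begin() + 1, strs.end(),
                [&](const std::string& s) {
                    return s.find(candidate) != std::string::npos;
                });
            if (inAll) return static_cast<int>(len);
        }
    }
    return 0;
}
```

Calling it with the four strings of the example should give the stated result of 5.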
It is also given that the result cannot be greater than 60, even though the length of each string can be as high as 13 000.
What I tried is this: I took the length of the shortest of the given strings, compared it with 60, and used the smaller of the two values as the starting point. Then I took substrings of the first string of length len, where len goes from the starting point down to 1. At each iteration I take every substring of the first string of length len and use it as pattern. Using the KMP algorithm (so O(n+m) per search, where here n is the length of the text and m the length of the pattern), I go through all the other strings (from 2 to n) and check whether pattern occurs in string i. As soon as it is not found, I break out of the loop and try the next substring of length len or, if there is none left, I decrease len and try all substrings of the new, smaller length. If pattern is found in every string, I stop the program and print len: since we start from the longest possible length and decrease it at each step, the first match must have the largest possible length. Here is the code (though it doesn't really matter, since this method is not good enough; I know I shouldn't use using namespace std, but it doesn't affect this program so I didn't bother):
#include <iostream>
#include <string>
#define nmax 50001
#define result_max 60
using namespace std;
int n,m,lps[nmax],starting_point,len;
string a[nmax],pattern,str;
// Builds the KMP failure table (longest proper prefix that is also a suffix)
// for the global pattern.
void create_lps() {
    lps[0]=0;
    unsigned int len=0,i=1;
    while (i < pattern.length()) {
        if (pattern[i] == pattern[len]) {
            len++;
            lps[i] = len;
            i++;
        }
        else {
            if (len != 0) {
                len = lps[len-1];
            }
            else {
                lps[i] = 0;
                i++;
            }
        }
    }
}
// Returns true if the global pattern occurs anywhere in a[index].
bool kmp_MatchOrNot(int index) {
    unsigned int i=0,j=0;
    while (i < a[index].length()) {
        if (pattern[j] == a[index][i]) {
            j++;
            i++;
        }
        if (j == pattern.length()) {
            return true;
        }
        else if (i<a[index].length() && pattern[j]!=a[index][i]){
            if (j != 0) {
                j = lps[j-1];
            }
            else {
                i++;
            }
        }
    }
    return false;
}
int main()
{
    int i,left,n;
    unsigned int minim = nmax;
    bool solution;
    cin>>n;
    for (i=1;i<=n;i++) {
        cin>>a[i];
        if (a[i].length() < minim) {
            minim = a[i].length();
        }
    }
    // The answer is bounded by both the shortest string and result_max.
    if (minim < result_max) starting_point = minim;
    else starting_point = result_max;
    // Try lengths from longest to shortest; the first hit is the answer.
    for (len=starting_point; len>=1; len--) {
        for (left=0; (unsigned)left<=a[1].length()-len; left++) {
            pattern = a[1].substr(left,len);
            solution = true;
            for (i=2;i<=n;i++) {
                if (pattern.length() > a[i].length()) {
                    solution = false;
                    break;
                }
                else {
                    create_lps();
                    if (kmp_MatchOrNot(i) == false) {
                        solution = false;
                        break;
                    }
                }
            }
            if (solution == true) {
                cout<<len;
                return 0;
            }
        }
    }
    return 0;
}
The thing is this: the program works correctly and gives the right results, but when I submitted the code on the website it got a 'Time limit exceeded' error, so I only got half the points.
This leads me to believe that, in order to solve the problem with a better time complexity, I have to take advantage of the fact that the strings contain only the letters a and b. It looks like an important constraint that I am not using at all, but I have no idea how exactly I could use it. I would appreciate any help.
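For illustration, here is a minimal sketch of one way the two-letter alphabet and the bound of 60 might be exploited. It is an assumption on my part, not a confirmed solution to this task. The idea: with one bit per character (a as 0, b as 1), any window of at most 60 characters fits exactly in a 64-bit integer, so windows can be compared as numbers instead of being searched with KMP. In addition, if a common substring of length L exists, then one of length L-1 exists too, so the length can be binary searched. The names windowCodes, hasCommon and longestCommonLength are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Sorted, de-duplicated codes of all windows of length len in s,
// using one bit per character ('a' -> 0, 'b' -> 1).
// Assumes 1 <= len <= 63 so that a window fits in 64 bits.
std::vector<std::uint64_t> windowCodes(const std::string& s, int len) {
    assert(len >= 1 && len <= 63);
    std::vector<std::uint64_t> codes;
    if (static_cast<int>(s.size()) < len) return codes;
    const std::uint64_t mask = (std::uint64_t{1} << len) - 1;
    std::uint64_t code = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Shift in the new character and drop the one that left the window.
        code = ((code << 1) | (s[i] == 'b' ? 1u : 0u)) & mask;
        if (i + 1 >= static_cast<std::size_t>(len)) codes.push_back(code);
    }
    std::sort(codes.begin(), codes.end());
    codes.erase(std::unique(codes.begin(), codes.end()), codes.end());
    return codes;
}

// True if some substring of length len occurs in every string.
bool hasCommon(const std::vector<std::string>& strs, int len) {
    std::vector<std::uint64_t> common = windowCodes(strs[0], len);
    for (std::size_t k = 1; k < strs.size() && !common.empty(); ++k) {
        const std::vector<std::uint64_t> other = windowCodes(strs[k], len);
        std::vector<std::uint64_t> kept;
        std::set_intersection(common.begin(), common.end(),
                              other.begin(), other.end(),
                              std::back_inserter(kept));
        common.swap(kept);
    }
    return !common.empty();
}

// Binary search on the answer; cap is the stated upper bound on the result.
int longestCommonLength(const std::vector<std::string>& strs, int cap = 60) {
    assert(!strs.empty());
    std::size_t shortest = strs[0].size();
    for (const std::string& s : strs) shortest = std::min(shortest, s.size());
    int lo = 0;
    int hi = static_cast<int>(std::min<std::size_t>(shortest, cap));
    while (lo < hi) {
        const int mid = (lo + hi + 1) / 2;  // always >= 1 here
        if (hasCommon(strs, mid)) lo = mid; else hi = mid - 1;
    }
    return lo;
}
```

With this sketch each length is checked in roughly sorting time over the windows of every string, and only about six lengths are tried, whereas my KMP version may run a full search for each of up to 13 000 windows at each of up to 60 lengths. Whether that is fast enough for the judge is something I have not verified.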