Given a suffix array, a TopCoder task from SRM 630 asks for the minimum number of distinct characters in any string that has that suffix array. The full problem statement can be found on the TopCoder website.
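For readers who want to experiment, here is a minimal sketch of what a suffix array is: the start indices of all suffixes, sorted lexicographically. The class name `NaiveSuffixArray`, the method `build`, and the sample word are my own choices for illustration and are not part of the TopCoder task; the sort is deliberately naive and only meant for small inputs.

```java
import java.util.Arrays;
import java.util.Comparator;

class NaiveSuffixArray {
    // Returns the start indices of all suffixes of s in lexicographic order.
    // Naive approach: materialises each suffix, fine for tiny test strings only.
    static int[] build(String s) {
        Integer[] idx = new Integer[s.length()];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        Arrays.sort(idx, Comparator.comparing((Integer i) -> s.substring(i)));
        int[] sa = new int[idx.length];
        for (int i = 0; i < sa.length; i++) {
            sa[i] = idx[i];
        }
        return sa;
    }

    public static void main(String[] args) {
        // Suffixes of "banana" in order: a, ana, anana, banana, na, nana
        System.out.println(Arrays.toString(build("banana"))); // [5, 3, 1, 0, 4, 2]
    }
}
```

The task runs in the opposite direction: it starts from an array like this one and asks how few distinct characters a matching string can have.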
The best solution I found is right here: https://github.com/ftiasch/acm-icpc/blob/6db1ed02a727611830b974a1d4de38bab8f390f9/topcoder/single-round-match/single-round-match-630/SuffixArrayDiv1.java
Here is the algorithm written by ftiasch:
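    public int minimalCharacters(int[] array) {
        int n = array.length;
        // position[s] is the rank of the suffix starting at s.
        int[] position = new int[n + 1];
        for (int i = 0; i < n; ++i) {
            position[array[i]] = i;
        }
        // The empty suffix (s == n) ranks below every other suffix.
        position[n] = -1;
        // minimum[i] is the fewest characters needed for array[i..n-1]; minimum[n] is 0.
        int[] minimum = new int[n + 1];
        for (int i = n - 1; i >= 0; --i) {
            minimum[i] = Integer.MAX_VALUE;
            for (int j = i + 1; j <= n; ++j) {
                // array[i..j-1] can share one first character only if the
                // suffixes that follow that character keep the same order.
                boolean valid = true;
                for (int x = i; x < j; ++x) {
                    for (int y = x + 1; y < j; ++y) {
                        valid &= position[array[x] + 1] < position[array[y] + 1];
                    }
                }
                if (valid && minimum[j] < Integer.MAX_VALUE) {
                    minimum[i] = Math.min(minimum[i], minimum[j] + 1);
                }
            }
        }
        return minimum[0];
    }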
I understand that this is a dynamic programming algorithm, but how does it work? I really need a hand understanding it.
EDIT
Here is what ftiasch wrote back to me:
hi Ariel,
First of all, thanks for your compliment. Frankly speaking, my solution is not the best solution to the problem. The optimal one runs in O(n) time but mine takes O(n^4). I just picked this idea during the contest because n is relatively small.
Keep in mind that suffixes starting with the same character are contiguous in the SA. Since the problem asks for the least number of characters, I decided to use dynamic programming to partition the SA into consecutive segments so that all suffixes in a segment start with the same character.
Which condition is necessary for S[SA[i]] == S[SA[j]], assuming that i < j? The lexicographic comparison suggests that suffix(SA[i] + 1) should be smaller than suffix(SA[j] + 1). We can easily find that the condition is also sufficient.
Write to me if you have any other question. :)
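To make the condition in the letter concrete, here is a minimal sketch of it as a predicate. The names `SameFirstCharCheck`, `ranks` and `canShareFirstChar` are hypothetical, and the sample array (the suffix array of "banana") is my own example. As in ftiasch's code, the empty suffix at index n is given rank -1 so that it compares smaller than every real suffix.

```java
class SameFirstCharCheck {
    // rank[p] is the index of suffix p in the suffix array;
    // the empty suffix (p == n) is assumed to rank below all others.
    static int[] ranks(int[] sa) {
        int n = sa.length;
        int[] rank = new int[n + 1];
        for (int i = 0; i < n; i++) {
            rank[sa[i]] = i;
        }
        rank[n] = -1;
        return rank;
    }

    // For i < j: true if S[sa[i]] == S[sa[j]] is consistent with the order,
    // i.e. suffix(sa[i] + 1) sorts before suffix(sa[j] + 1).
    static boolean canShareFirstChar(int[] sa, int i, int j) {
        int[] rank = ranks(sa);
        return rank[sa[i] + 1] < rank[sa[j] + 1];
    }

    public static void main(String[] args) {
        int[] sa = {5, 3, 1, 0, 4, 2}; // suffix array of "banana"
        System.out.println(canShareFirstChar(sa, 0, 1)); // true:  "a" and "ana"
        System.out.println(canShareFirstChar(sa, 2, 3)); // false: "anana" and "banana"
        System.out.println(canShareFirstChar(sa, 4, 5)); // true:  "na" and "nana"
    }
}
```

The dynamic programming solution applies this test to every pair inside a candidate segment. Because the test is a strict comparison of ranks, it holds for all pairs in a segment exactly when it holds for each adjacent pair.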
EDIT1
We finally managed to make it work, thanks to David. Here is the linear-time algorithm in Java, ported from David's Python version:
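    public int minimalCharacters(int[] array) {
        int n = array.length, i;
        if (n == 0)
            return 0;
        // array1 is the suffix array with the empty suffix (index n) prepended at rank 0.
        int[] array1 = new int[n + 1];
        array1[0] = n;
        for (i = 0; i < n; i++)
            array1[1 + i] = array[i];
        // position[s] is the rank of the suffix starting at index s.
        int[] position = new int[n + 1];
        for (i = 0; i < n + 1; i++)
            position[array1[i]] = i;
        // One character to start; every adjacent pair whose following
        // suffixes are out of order forces one more character.
        int k = 1;
        for (i = n; i > 1; i--) {
            if (position[array1[i] + 1] <= position[array1[i - 1] + 1])
                k++;
        }
        return k;
    }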
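As a sanity check, the linear-time count can be compared against brute force on small inputs. The sketch below is only an illustration under my own assumptions: the class and helper names are hypothetical, the alphabet is limited to {a, b, c}, and `minimalCharacters` here is a compact re-indexing of the routine above (empty suffix at rank -1) rather than a copy of it. For every suffix array that occurs among the enumerated strings, it records the smallest number of distinct characters seen and compares that with the linear-time result.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

class SuffixArrayCrossCheck {
    // One character to start, plus one per adjacent pair of suffixes
    // that cannot share a first character.
    static int minimalCharacters(int[] sa) {
        int n = sa.length;
        if (n == 0) return 0;
        int[] rank = new int[n + 1];
        rank[n] = -1; // the empty suffix sorts before everything
        for (int i = 0; i < n; i++) rank[sa[i]] = i;
        int k = 1;
        for (int i = 1; i < n; i++) {
            if (rank[sa[i] + 1] <= rank[sa[i - 1] + 1]) k++;
        }
        return k;
    }

    // Naive suffix array: sort start indices by the suffix text.
    static int[] naiveSuffixArray(String s) {
        Integer[] idx = new Integer[s.length()];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, Comparator.comparing((Integer i) -> s.substring(i)));
        int[] sa = new int[idx.length];
        for (int i = 0; i < sa.length; i++) sa[i] = idx[i];
        return sa;
    }

    // Decodes a base-3 number into a string over {a, b, c} of the given length.
    static String decode(int code, int len) {
        char[] cs = new char[len];
        for (int p = 0; p < len; p++, code /= 3) cs[p] = (char) ('a' + code % 3);
        return new String(cs);
    }

    // True if, for every string of length 1..maxLen over {a, b, c}, the linear
    // count for its suffix array equals the smallest distinct-character count
    // among all enumerated strings with that same suffix array.
    static boolean agreesWithBruteForce(int maxLen) {
        int total = 1;
        for (int len = 1; len <= maxLen; len++) {
            total *= 3;
            Map<String, Integer> best = new HashMap<>();
            for (int code = 0; code < total; code++) {
                String s = decode(code, len);
                int distinct = (int) s.chars().distinct().count();
                best.merge(Arrays.toString(naiveSuffixArray(s)), distinct, Integer::min);
            }
            for (int code = 0; code < total; code++) {
                int[] sa = naiveSuffixArray(decode(code, len));
                if (minimalCharacters(sa) != best.get(Arrays.toString(sa))) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(minimalCharacters(new int[]{2, 1, 0}));          // 1, e.g. "aaa"
        System.out.println(minimalCharacters(new int[]{5, 3, 1, 0, 4, 2})); // 3, e.g. "banana"
        System.out.println(agreesWithBruteForce(6));
    }
}
```

Restricting the enumeration to three letters is enough for this check, since any string using at most three distinct characters can be relabelled to a, b, c without changing its suffix array.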