I am having trouble understanding why the solution to LeetCode's Repeated String Match only goes up to q + 1 repeats of A when checking whether B can be a substring of A repeated (here q is the number of copies of A needed for the repeated string to be at least as long as B).
I have read other Stack Overflow answers as well as the LeetCode discussion pages, but I am still unable to fully understand the solution.
The algorithm is explained as:
Imagine we wrote S = A+A+A+.... If B is to be a substring of S, we
only need to check whether some S[0:], S[1:], ..., S[len(A) - 1:]
starts with B, as S is long enough to contain B, and S has period
at most len(A).
Now, suppose q is the least number for which len(B) <= len(A * q).
We only need to check whether B is a substring of A * q or A *
(q+1). If we try k < q, then B has larger length than A * k and
therefore can't be a substring. When k = q+1, A * k is already big
enough to try all positions for B; namely, S[i:i+len(B)] == B for i
= 0, 1, ..., len(A) - 1.
The implementation is as follows:
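    class Solution {
        public int repeatedStringMatch(String A, String B) {
            int q = 1;
            StringBuilder S = new StringBuilder(A);
            // Append copies of A until S is at least as long as B; q counts the copies.
            for (; S.length() < B.length(); q++) S.append(A);
            // q copies are enough if B already occurs in S.
            if (S.indexOf(B) >= 0) return q;
            // Otherwise try exactly one more copy.
            if (S.append(A).indexOf(B) >= 0) return q + 1;
            return -1;
        }
    }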
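One way to sanity-check the q + 1 bound empirically is to compare the solution against a brute-force search that keeps appending copies well past q + 1. The sketch below is not part of the LeetCode solution: the class name `RepeatBoundCheck`, the helper names, the two-letter alphabet, and the cap of ten extra copies are all assumptions chosen for illustration. It only tests short strings, so it is evidence for the bound, not a proof.

```java
import java.util.ArrayList;
import java.util.List;

final class RepeatBoundCheck {
    // Same logic as the solution above: try q copies, then q + 1.
    static int viaBound(String a, String b) {
        int q = 1;
        StringBuilder s = new StringBuilder(a);
        for (; s.length() < b.length(); q++) s.append(a);
        if (s.indexOf(b) >= 0) return q;
        if (s.append(a).indexOf(b) >= 0) return q + 1;
        return -1;
    }

    // Brute force: keep appending copies far beyond q + 1.
    // The cap is arbitrary (an assumption); a must be non-empty.
    static int viaBruteForce(String a, String b) {
        int cap = b.length() / a.length() + 10;
        StringBuilder s = new StringBuilder();
        for (int k = 1; k <= cap; k++) {
            s.append(a);
            if (s.indexOf(b) >= 0) return k;
        }
        return -1;
    }

    // All strings over the alphabet {a, b} of exactly the given length.
    static List<String> words(int length) {
        List<String> out = new ArrayList<>();
        for (int mask = 0; mask < (1 << length); mask++) {
            StringBuilder w = new StringBuilder();
            for (int bit = 0; bit < length; bit++) {
                w.append(((mask >> bit) & 1) == 0 ? 'a' : 'b');
            }
            out.add(w.toString());
        }
        return out;
    }

    // Counts the (A, B) pairs on which the two methods disagree.
    static int countMismatches(int maxLenA, int maxLenB) {
        int mismatches = 0;
        for (int la = 1; la <= maxLenA; la++)
            for (String a : words(la))
                for (int lb = 1; lb <= maxLenB; lb++)
                    for (String b : words(lb))
                        if (viaBound(a, b) != viaBruteForce(a, b)) mismatches++;
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(4, 6));
    }
}
```

If the bound holds, the brute-force search should never find a match that needs more than q + 1 copies, so the two methods should agree on every pair.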
I understand that while the repeated string S is shorter than B, B cannot be a substring of it, so we need to keep appending A until S.length() is at least B.length(). But once that is the case, why do we only need to add one more copy of A to get the minimum number of repeats?
My intuition is that after A is repeated some number of times, a pattern is established, and if B does not fit that pattern, then no matter how many more times you repeat A, B will never be a substring of the repeated A.
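That intuition can be made concrete with a small sketch. In the infinite string A+A+A+..., character i is A[i mod len(A)], so the window of length len(B) starting at offset i is identical to the one starting at i + len(A). Only the offsets 0 to len(A) - 1 give distinct windows, and the last of those ends before index len(A) + len(B), which is at most len(A) * (q + 1). The class name `PeriodicWindows` and the helpers `window` and `leastQ` are hypothetical names introduced here, and the example values are assumptions.

```java
final class PeriodicWindows {
    // Window of the infinite string A+A+A+... starting at `start`;
    // character i of that string is A[i mod len(A)].
    static String window(String a, int start, int len) {
        StringBuilder w = new StringBuilder(len);
        for (int j = 0; j < len; j++) {
            w.append(a.charAt((start + j) % a.length()));
        }
        return w.toString();
    }

    // Smallest q with len(A) * q >= lenB (assumes lenB >= 1).
    static int leastQ(String a, int lenB) {
        return (lenB + a.length() - 1) / a.length();
    }

    public static void main(String[] args) {
        String a = "abcd";   // example values, chosen for illustration
        int lenB = 8;
        int q = leastQ(a, lenB);
        int available = a.length() * (q + 1);
        for (int i = 0; i < a.length(); i++) {
            String here = window(a, i, lenB);
            String onePeriodLater = window(a, i + a.length(), lenB);
            System.out.println("offset " + i + ": " + here
                    + " | repeats one period later: " + here.equals(onePeriodLater)
                    + " | fits in q + 1 copies: " + (i + lenB <= available));
        }
    }
}
```

Each candidate window both repeats one period later and fits inside q + 1 copies, which is why a match that does not appear in A * (q + 1) cannot appear in any longer repetition.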
However, I don't see why the answer has to be exactly the number of copies needed to reach B's length, or one more copy than that.
If someone could clear up this confusion for me, it would be much appreciated. Thank you.