Given a binary string of length N (<= 10^5), I want to find the length of the cycle of the string, i.e. the length of its shortest repeating pattern. The length of the cycle will be at least 1 and at most 1000.

Example:

110110110110: the length of the cycle is 3 (the repeating pattern is 110)

000000: the length of the cycle is 1 (the repeating pattern is 0)

1101101101: the length of the cycle is 3 (the repeating pattern is 110)

I have tried to understand Floyd's cycle detection algorithm, but I can't see how to apply it to this question.

How do I solve this problem efficiently? (I want an algorithm that runs in O(N log N) or better.)

  • Answered here: http://stackoverflow.com/questions/8253491/periodic-binary-strings/8257829#8257829 – MBo Mar 14 '15 at 10:36
  • Depends on how you define the question. Does 1101101101 have a cycle of 3? Or not because the last cycle isn't finished? (110 110 110 1) – Misch Mar 14 '15 at 10:53
  • ^Yes it would. I will edit the question – Dhruv Srivastava Mar 14 '15 at 10:55
  • Can you please post the original statement of your problem? (which I'm sure is from some programming competition or some online judge) – ale64bit Mar 14 '15 at 11:06

4 Answers

4

Here is a linear solution to this problem:

  1. Compute the prefix function of the given string (as in the Knuth-Morris-Pratt algorithm).

  2. The answer is always n - p[n], where n is the length of the given string and p[i] is the value of the prefix function at the i-th position of the string, i.e. the length of the longest proper prefix of s[1..i] that is also a suffix of it. Proof:

    • The period is not smaller than n - p[n]. For any period k, s[i] = s[i + k] for every valid i, so the prefix of length n - k equals the suffix of length n - k. By the definition of the prefix function, p[n] is at least n - k, so k is at least n - p[n].

    • The period is not greater than k = n - p[n]. By the definition of the prefix function, the prefix of length p[n] equals the suffix of length p[n], so s[i] = s[i + k] for all valid i, which means that k is a period.
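The two steps above can be sketched in Python as follows. This is only an illustration, not the answerer's own code: the function names `prefix_function` and `cycle_length` are made up for the example, and the list is 0-based, so `p[-1]` plays the role of p[n] in the 1-based notation used above.

```python
def prefix_function(s):
    """p[i] = length of the longest proper prefix of s[:i+1] that is also its suffix."""
    n = len(s)
    p = [0] * n
    for i in range(1, n):
        k = p[i - 1]
        # Fall back to shorter borders until the next character matches.
        while k > 0 and s[i] != s[k]:
            k = p[k - 1]
        if s[i] == s[k]:
            k += 1
        p[i] = k
    return p


def cycle_length(s):
    """Smallest period of s, computed as n - p[n]; 0 for an empty string."""
    if not s:
        return 0
    return len(s) - prefix_function(s)[-1]


if __name__ == "__main__":
    for bits in ("110110110110", "000000", "1101101101"):
        print(bits, cycle_length(bits))
```

Each character is visited a bounded number of times on average, so the whole computation is linear in the length of the string.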

kraskevich
  • is there any proof of correctness for your algorithm? actually I don't just want the answer, I would actually like to understand how you came up with it! :) – Dhruv Srivastava Mar 14 '15 at 11:04
  • @user3918495 I have added a proof. – kraskevich Mar 14 '15 at 11:05
  • @kaktusito It does work when the cycle is not complete. For `1101101101`, the longest string that ends in the last position and is a prefix of the given string is `1101101`, which means that `n - p[n] = 3`. – kraskevich Mar 14 '15 at 11:09
  • Yeah, I realized about it and deleted my comment. +1 – ale64bit Mar 14 '15 at 11:10
  • What would the prefix function look like for a string of bits? You've only got two possible values – Rup Mar 14 '15 at 11:12
  • @Rup prefix function is defined over any alphabet. It doesn't matter if it is `0` and `1` or letters. – kraskevich Mar 14 '15 at 11:13
1

Floyd's cycle detection algorithm is meant for a slightly different problem: finding a cycle in a graph, where the whole graph does not have to be a cycle.

As an example, compare these two graphs:

1 -> 2 -> 3 -> 4 -> 1 -> ...

and

1 -> 2 -> 3 -> 4 -> 2 -> ...

Both have a cycle, but in the second one the cycle covers only part of the nodes (node 1 doesn't appear in the cycle).

You are not interested in cycles like the one in the second example, only in "full cycles".


Additionally, as you are working with bits, your algorithm will be a little different than if you were working with integers (for example). The reason is that you can compare many bits at once with a single comparison (as long as the total number of bits fits in one integer).

Here is a possible idea how you could solve the problem:

To check if there is a cycle of 1, shift the integer by one and compare it with itself:

000000000000
>000000000000
-yyyyyyyyyyy-
=> Matches!

110110110110
>110110110110
-ynnynnynnyn-
=> Nope

So the 000000000000 has a cycle of 1, 110110110110 doesn't, so continue testing with 2:

110110110110
>>110110110110
--nynnynnynn--
=> Nope

Continue with 3:

110110110110
>>>110110110110
---yyyyyyyyy---
=> Matches!

Of course, you'll have to implement what I just described with bit arithmetic; I'll leave that up to you.
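As a rough sketch of that shift-and-compare idea, the Python snippet below treats the whole string as one arbitrary-precision integer, which stands in for the word-by-word bit arithmetic described above. The function name `cycle_length_by_shifting` is invented for this example. For a shift of k, the first n - k bits (`x >> k`) must equal the last n - k bits (`x & low_mask`).

```python
def cycle_length_by_shifting(bits):
    """Try shifts 1, 2, 3, ... and return the first one where the overlap matches."""
    n = len(bits)
    if n == 0:
        return 0
    x = int(bits, 2)  # leading zeros are lost here, so n is tracked separately
    for k in range(1, n + 1):
        low_mask = (1 << (n - k)) - 1
        # x >> k      : bits 0 .. n-k-1 of the string (the first n-k characters)
        # x & low_mask: bits k .. n-1 of the string (the last n-k characters)
        if (x >> k) == (x & low_mask):
            return k
    return n  # not reached: k == n always matches (both sides are 0)


if __name__ == "__main__":
    for bits in ("000000000000", "110110110110", "1101101101"):
        print(bits, cycle_length_by_shifting(bits))
```

Each shift still touches all N bits, so this only saves a constant factor (the machine word size) compared with a character-by-character check; it tries up to K shifts for a cycle of length K.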

Misch
  • The algorithm you describe saves a constant factor thanks to the bit-level arithmetic, but asymptotically is not O(N lg N) or better (supposing arbitrary cycle len), as wanted by the OP. The comment @MBo above links to an efficient solution to this problem. – ale64bit Mar 14 '15 at 10:42
  • I just saw that I overlooked the size requirement given in the question, so yes this answer is not really correct... – Misch Mar 14 '15 at 10:42
  • Yes, but in the worst case I would be doing K comparisons for a string of length N, hence the complexity would be O(NK) which is not as efficient as I wanted. – Dhruv Srivastava Mar 14 '15 at 10:44
  • @kaktusito Isn't MBo's solution just the same as this? – Rup Mar 14 '15 at 10:44
  • @Misch Now that I see, it looks like. Sorry about that. But in both cases (yours and MBo's) you can make the whole process run in O(N) just by hashing the input string and updating the shift in O(1) thanks to the hash. What do you think? – ale64bit Mar 14 '15 at 10:46
  • I don't really get how you want to use hashing to solve the problem... Hash the input and compare it to what? – Misch Mar 14 '15 at 10:48
1

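If the string always starts at the beginning of a cycle and consists of complete cycles only, it is simple. You can do something like this:

// Returns the cycle length, or 0 if no cycle is found.
// 'cycles' receives the number of repetitions of the pattern.
// Only handles strings made of complete repetitions (binary.Length % i == 0).
public int GetCycleLength(string binary, out int cycles)
{
    // The cycle length is at most 1000 and cannot exceed the string length.
    for (int i = 1; i <= 1000 && i <= binary.Length; i++)
    {
        if (binary.Length % i == 0)
        {
            cycles = 1;
            // Compare every block of length i with the block before it,
            // starting with block 1 against block 0.
            while (cycles * i < binary.Length
                   && binary.Substring((cycles - 1) * i, i) == binary.Substring(cycles * i, i))
            {
                cycles++;
            }
            // All blocks matched if the comparison reached the end of the string.
            if (cycles * i == binary.Length)
            {
                return i;
            }
        }
    }
    cycles = 0;
    return 0;
}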
HenrikD
1

In addition to the answer I already have, I have another idea. Maybe it doesn't work at all, so correct me if I'm wrong (I couldn't find anything about this topic on Google). However, it doesn't work on a bit-level, so you have an overhead of 32 or 64...

Let's call the string we are analysing S.

You could maybe use the Knuth-Morris-Pratt algorithm (which finds a substring in a string in linear time) to find S in the string SS (S concatenated with itself). Of course you have to start your search at index 2, i.e. at the second character. The index the algorithm returns is then the length of the cycle.


EDIT: As kaktusito mentioned, this won't work. However, you can use a slightly modified KMP algorithm to find the string S in the string S itself, starting at index 2. The original algorithm will of course not find a match, but you can modify it to keep searching even when the part of the pattern still to be matched is longer than what remains of the text. Then, as soon as a partial match reaches the end of the original string, the position where that match started gives you the length of the cycle.
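A possible reading of this idea, sketched in Python under my own assumptions: run the usual KMP matcher with S as both pattern and text, start at the second character, and look at how much of the pattern is still matched when the text ends. The names `failure_table`, `cycle_length_by_self_search` and `cycle_length_by_doubling` are invented for this sketch, and `str.find` merely stands in for a KMP search in the doubling variant.

```python
def failure_table(pattern):
    """Standard KMP failure table: fail[i] = longest proper border of pattern[:i+1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def cycle_length_by_self_search(s):
    """Search s inside s from the second character, letting the match run off the end."""
    n = len(s)
    if n == 0:
        return 0
    fail = failure_table(s)
    matched = 0  # number of pattern characters currently matched
    for i in range(1, n):  # index 1 is the "index 2" of the 1-based description
        while matched > 0 and s[i] != s[matched]:
            matched = fail[matched - 1]
        if s[i] == s[matched]:
            matched += 1
    # The partial match still alive at the end of the text started at n - matched.
    return n - matched


def cycle_length_by_doubling(s):
    """The first idea: find s in s + s from the second character (complete cycles only)."""
    return (s + s).find(s, 1)


if __name__ == "__main__":
    print(cycle_length_by_self_search("1101101101"))  # incomplete last cycle
    print(cycle_length_by_doubling("1101101101"))     # doubling misses it
```

On `110110110110` both functions agree, but on `1101101101` the doubling variant only finds the match at offset 10, which illustrates why the first idea fails when the last cycle is incomplete.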

Misch