I am trying to collapse every character that is repeated more than twice in a row in an extremely long string, keeping only two copies. For example, the word Terrrrrrific becomes Terrific.
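
For reference, the single-character case can be sketched as below. This is only one possible way to do it, assuming Python and a regex backreference; the function name `collapse_char_runs` and the `keep` parameter are made up for illustration.

```python
import re


def collapse_char_runs(text: str, keep: int = 2) -> str:
    """Shorten every run of more than `keep` identical characters to `keep`.

    Hypothetical helper: the name and the `keep` parameter are assumptions.
    """
    # (.) captures one character; \1{keep,} requires at least `keep` more
    # copies right after it, so only runs of keep + 1 or longer match.
    pattern = re.compile(r"(.)\1{%d,}" % keep, flags=re.DOTALL)
    return pattern.sub(lambda m: m.group(1) * keep, text)


print(collapse_char_runs("Terrrrrrific"))  # Terrific
```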

Now my question is: how do I filter out repeats that span more than a single character in the same way? For example, given Words words words words words, I want to reduce it to words words. The repeated unit might also be something less sensible, such as abcdabcdabcdabcdabcd, which should become abcdabcd.

I suspect that I should use a suffix tree, but I'm not sure how to go about the algorithm exactly.

ElGavilan
Kanadaj
  • What you're looking for are also known as "tandem repeats" (due to a related task involving DNA sequences). When you allow more than one character, you have to define carefully what you mean by a repeat: e.g. `words words words words words` also contains 3 (overlapping) repeats of the string `words words words`. – j_random_hacker Jun 30 '15 at 11:55

1 Answer

I don't know whether this algorithm is efficient enough for you, but you can do this:

  1. Choose a length for the repeats you want to find.
  2. Then, for every starting offset from 0 to length-1, go through the string.
  3. Maintain a stack: split the string into disjoint substrings of that length, and push each one onto the stack only if the top two entries of the stack are not both equal to it.
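
The steps above could be sketched roughly as follows. This is one reading of the idea, not a tuned implementation: the function names and the parameters `k` (the chosen repeat length) and `start` (the offset) are made up for illustration, and each pass copies the string, so it costs about O(n * k) per chosen length.

```python
def collapse_blocks(text: str, k: int, start: int) -> str:
    """One pass: cut text[start:] into disjoint blocks of length k and keep
    at most two identical blocks in a row. Hypothetical helper name."""
    stack = []
    for i in range(start, len(text), k):
        block = text[i:i + k]  # the final block may be shorter than k
        # Skip the block if the two entries on top of the stack equal it,
        # i.e. this would be a third (or later) consecutive copy.
        if len(stack) >= 2 and stack[-1] == block and stack[-2] == block:
            continue
        stack.append(block)
    # The first `start` characters lie before the block grid; keep them as-is.
    return text[:start] + "".join(stack)


def collapse_repeats(text: str, k: int) -> str:
    """Run the pass for every offset 0..k-1 so that repeats which are not
    aligned to position 0 are also found. Hypothetical helper name."""
    for start in range(k):
        text = collapse_blocks(text, k, start)
    return text


print(collapse_repeats("abcdabcdabcdabcdabcd", 4))  # abcdabcd
```

With `k = 1` this reduces to the single-character case, so `collapse_repeats("Terrrrrrific", 1)` gives `Terrific`. Handling all repeat lengths means calling it once per length you care about, and the result can depend on the order in which the lengths are tried.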
encrypt