python find repeated substring in string

Question

I am looking for a Python function that takes a string in which a certain word has been repeated several times until a certain length has been reached.

The output would then be that word. The repeated word isn't necessarily repeated in its entirety, and it is also possible that it hasn't been repeated at all.

For example:

"pythonpythonp" => "python"

"hellohello" => "hello"

"appleapl" => "apple"

"spoon" => "spoon"

Can someone give me some hints on how to write this kind of function?

How does your program know what is a word? For example, how would it know that `'appleapl'` is not a single word? What about words that contain other words? — elethan, Dec 10 '16 at 15:52
I'll start with hints. If you're still stuck after trying those, post your attempted solution and we can give you more things to think about and try. Here are the hints. (1) First generate the possible sub-strings you want to search in each string. Is there a min or max length? Build a list or set of sub-strings from the input string. (2) Once you have the sub-strings to search for, try to identify the unique locations within the input string where the substrings appear. That should get you started! — Chris Johnson, Dec 10 '16 at 15:52
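
To make the two hints concrete, here is a minimal sketch. It assumes the candidates can be limited to prefixes of the input, since the repeated word always starts at index 0; the hint itself only says "sub-strings". The name `prefix_positions` is made up for illustration.

```python
def prefix_positions(s):
    """Map each proper, non-empty prefix of s to the start indices where it occurs."""
    positions = {}
    for length in range(1, len(s)):
        prefix = s[:length]
        starts = []
        i = s.find(prefix)
        while i != -1:
            starts.append(i)
            # advance by one character so overlapping occurrences are kept
            i = s.find(prefix, i + 1)
        positions[prefix] = starts
    return positions


print(prefix_positions("hellohello")["hello"])  # [0, 5]
```

A prefix whose start indices are evenly spaced from 0 up to the end of the string is a natural candidate for the repeated word, though deciding that is left to the reader here.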

Tom Fuller · Answer 1 · 2016-12-10T16:03:39.857

9

You can do it by taking a prefix of the string, repeating it enough times to cover the original length (padding with a partial copy if needed), and testing whether the result equals the original string.

You'll have to try this for every possible prefix length, unless you already know the length of the word.

Here's the code:

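def repeats(string):
    # try every prefix length, shortest first
    for x in range(1, len(string)):
        substring = string[:x]
        # whole copies that fit, plus the length of the trailing partial copy
        whole, partial = divmod(len(string), x)
        if substring * whole + substring[:partial] == string:
            return substring

    # no shorter prefix rebuilds the string, so nothing was repeated
    return string

print(repeats("pythonpytho"))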

edited Dec 10 '16 at 16:03

answered Dec 10 '16 at 15:55


fails with "spoon". – Jean-François Fabre Dec 10 '16 at 16:01
Thanks for pointing that out, I've fixed the error – Tom Fuller Dec 10 '16 at 16:03
fails for "repeater" – Abhinav Ralhan Feb 02 '21 at 10:11
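
For context on the last comment: with the code above, "repeater" yields "repeate", because the trailing "r" matches the first letter, so a 7-character prefix plus a 1-character partial copy rebuilds the string. Whether that is a failure depends on how much of the second copy must be present, which the question does not specify. Below is a minimal sketch under the assumption (mine, not the asker's) that a minimum number of characters must lie beyond the first copy. `repeated_word` and `min_overlap` are hypothetical names, and the comparison `s[x:] == s[:-x]` is an equivalent way of writing the same periodicity test.

```python
def repeated_word(s, min_overlap=1):
    """Return the shortest prefix that rebuilds s when repeated and truncated.

    min_overlap is a hypothetical knob: a prefix is accepted only if at
    least that many characters of s lie beyond the first copy.
    """
    for x in range(1, len(s)):
        if len(s) - x < min_overlap:
            break  # longer prefixes leave even less overlap
        # s has period x exactly when it matches itself shifted by x
        if s[x:] == s[:-x]:
            return s[:x]
    return s


print(repeated_word("repeater"))                    # repeate
print(repeated_word("repeater", min_overlap=2))     # repeater
print(repeated_word("pythonpytho", min_overlap=2))  # python
```

Raising `min_overlap` trades false positives like "repeate" against missing words whose second copy is cut very short, so there is no single right value.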

score 0 · Answer 2 · answered Dec 10 '16 at 16:58

Start by building a prefix array.

Loop through it in reverse and stop the first time you find a prefix that is repeated in your string (that is, one with a str.count() > 1).

Now, if the same substring occurs right next to itself, you can return it as the word you're looking for. However, you must take the 'appleappl' example into consideration, where that approach alone would return 'appl'. To handle it, when you find a substring that occurs more than once in your string, return that substring plus whatever lies between it and its next occurrence; for 'appleappl' you return 'appl' + 'e' = 'apple'. If no such substring is found, return the whole string, since there are no repetitions.

def repeat(s):
    # collect every proper prefix of s: '', s[:1], ..., s[:-1]
    prefix_array = []
    for i in range(len(s)):
        prefix_array.append(s[:i])
    # see what it holds to give you a better picture
    print(prefix_array)

    # walk from the longest prefix down; the slice stops before index 1,
    # so the empty prefix and the single-character prefix are skipped
    for i in prefix_array[:1:-1]:
        if s.count(i) > 1:
            # find where the next repetition starts
            offset = s[len(i):].find(i)
            return s[:len(i) + offset]

    return s


print(repeat("appleappl"))
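
One detail worth knowing when reasoning about the `s.count(i) > 1` test: `str.count` only counts non-overlapping occurrences. The short sketch below illustrates this; `count_overlapping` is a hypothetical helper added for comparison and is not part of the answer.

```python
s = "pythonpythonp"

# str.count scans left to right and skips past each match it finds
print(s.count("pythonp"))  # 1: the second "pythonp" starts at index 6, inside the first
print(s.count("python"))   # 2


def count_overlapping(s, sub):
    """Count occurrences of sub in s, allowing overlaps (hypothetical helper)."""
    total = 0
    i = s.find(sub)
    while i != -1:
        total += 1
        i = s.find(sub, i + 1)  # move one character on, not past the match
    return total


print(count_overlapping(s, "pythonp"))  # 2
```

So for "pythonpythonp" the loop passes over 'pythonp' and stops at 'python', which is the first prefix that `str.count` reports more than once.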

I have tweaked the code a little so that it outputs only the most-repeated string, without the trailing non-repeated part up to the point where the repetition begins. As this question is already closed, you can find the code here: https://hastebin.com/iripiwupux.rb – GreatSUN, Jul 19 '18 at 08:49
