def count_overlapping(sequence, sub):
    counts = 0
    n = len(sub)
    while sub in sequence:
        counts += 1
        sequence = sequence[(sequence.find(sub) + n-1):]
    return counts

input: sequence = agatabttagataagataagatagatabagata

input: sub = agata

output: 3

This is the required output, but my program gives 4. How do I ignore the occurrences that are not part of a consecutive run?

Someone please guide me here.

Drazer
  • 1
    I found this; I think it will help: https://stackoverflow.com/questions/41077268/python-find-repeated-substring-in-string/41077376 – Ejiofor May 22 '20 at 07:35
  • 2
    The formulation of the question is really unclear and partly self-contradictory. Do you mean that you're looking for the longest series of contiguous appearances of the substring? Or what exactly? Two answerers already read it in two completely different ways... – Thierry Lathuille May 22 '20 at 07:47
  • @ThierryLathuille sorry for being unclear. I was trying to find the number of repetitions of a sub string in a string but only the ones which are continuous appearances. I'll try to be more clear with my questions in future. – Drazer May 22 '20 at 09:49

2 Answers

The simplest (though not the most efficient) solution is to repeat the substring one more time on each iteration until the repeated string can no longer be found in the sequence. The last count that succeeded is the maximum number of consecutive repetitions:

s = "agatabttagataagataagatagatabagata"
sub = "agata"

counts = 0
while sub * (counts+1) in s:
    counts += 1

print(counts)

This gives 3.
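
The same loop can be wrapped in a reusable function. The following is only a sketch: the name max_consecutive_repeats is made up for illustration, and the empty-substring check is an added assumption, since sub * k is the empty string for any k and an empty string is always found in s, so the loop would never stop.

```python
def max_consecutive_repeats(s, sub):
    """Return the largest k such that sub repeated k times occurs in s."""
    if not sub:
        # "" * k is always "" and "" is in every string, so guard against
        # an endless loop (assumption: an empty sub is treated as an error).
        raise ValueError("sub must be non-empty")
    k = 0
    # Try one more repetition each time; stop at the first miss.
    while sub * (k + 1) in s:
        k += 1
    return k


print(max_consecutive_repeats("agatabttagataagataagatagatabagata", "agata"))
```

Each iteration builds a new string and scans s again, which is why this approach is simple rather than fast. For short sequences like the one in the question that should not matter.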

Tomerikoo

Here is a regex based approach which uses re.sub to remove all repeating groups of the substring. Then, to find the number of substrings which are present, we simply divide the difference in length by the length of the substring.

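import re

sequence = "agatabttagataagataagatagatabagata"
# Remove every run of two or more consecutive copies of the substring
out = re.sub(r'(?:agata){2,}', '', sequence)
# Integer division so the result is 3 rather than 3.0
num = (len(sequence) - len(out)) // len('agata')
print(num)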

This prints: 3
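
Note that the length-difference approach adds up every run of two or more copies, so it matches the longest run only when the sequence contains a single such run, as it does here. If the longest run is what is wanted, one possible regex variant is sketched below. The function name longest_run_regex is hypothetical, and the lookahead is my own choice: it lets a run start at any position, which matters when the substring can overlap itself.

```python
import re


def longest_run_regex(sequence, sub):
    """Return the length (in copies of sub) of the longest consecutive run."""
    if not sub:
        raise ValueError("sub must be non-empty")
    # The zero-width lookahead is tried at every position, and the capture
    # group greedily grabs as many back-to-back copies of sub as it can.
    pattern = re.compile(rf"(?=((?:{re.escape(sub)})+))")
    return max(
        (len(m.group(1)) // len(sub) for m in pattern.finditer(sequence)),
        default=0,  # no occurrence of sub at all
    )


print(longest_run_regex("agatabttagataagataagatagatabagata", "agata"))
```

re.escape is used so that a substring containing regex metacharacters is matched literally.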

Tim Biegeleisen
  • Thank you very much. I guess this one also works for my problem, but the other solution fits my program much better. – Drazer May 22 '20 at 09:55
  • @Drazer The answer you accepted is probably more efficient than what I have given you, but if you like or feel comfortable with regex, then this is a good option. – Tim Biegeleisen May 22 '20 at 10:02