I'm trying to write a function that implements a sliding window over a string, with a given window size and step size. Based on another Stack Overflow thread, I've come up with the following so far:

def sliding_window(string, window_size, step):
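    str_lst = list(string)
    i = 0  # start index of the next window (before shifting back by j)
    res = []
    # j = 0 takes the regular windows; j > 0 shifts the start back by j
    # characters so that a final window can still fit at the end.
    for j in range(0, step):
        while (i - j + window_size) <= len(str_lst):
            res.append(''.join(str_lst[(i - j):(i - j + window_size)]))
            i += step
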
    return res

sliding_window(string='abcdefghijkl', window_size=5, step=3)
# Output: ['abcde', 'defgh', 'ghijk', 'hijkl']


As the example above shows, when the windows do not fit the string evenly, the overlaps between consecutive substrings can differ (here the overlaps have lengths [2, 2, 4]). In that case I'd like the function to even out the overlaps between the substrings instead, e.g. to [2, 3, 3]. Any suggestions on how I can modify my function to achieve this would be very much appreciated!
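To make the goal concrete, here is a minimal sketch of one way the overlaps could be evened out. It is not taken from the thread mentioned above. The name `even_sliding_window` is hypothetical, and the sketch assumes that `step` is treated as a maximum step and that any shorter steps should come last. It first works out how many windows are needed, then spreads the distance to the last window's start across the gaps with `divmod`.

```python
def even_sliding_window(string, window_size, step):
    """Return fixed-size windows whose start offsets are spread evenly.

    Hypothetical helper: `step` is treated as the largest allowed step.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")

    span = len(string) - window_size  # start index of the last window
    if span < 0:
        return []  # the string is shorter than a single window

    n_gaps = -(-span // step)  # ceil(span / step): gaps between window starts
    if n_gaps == 0:
        return [string]  # exactly one window fits

    # Spread `span` over the gaps: the first `extra` gaps are one longer.
    base, extra = divmod(span, n_gaps)

    res = []
    start = 0
    for k in range(n_gaps + 1):
        res.append(string[start:start + window_size])
        start += (base + 1) if k < extra else base
    return res


windows = even_sliding_window(string='abcdefghijkl', window_size=5, step=3)
print(windows)
# Output: ['abcde', 'defgh', 'fghij', 'hijkl']
```

With the example string the window starts are 0, 3, 5 and 7, so the overlaps come out as [2, 3, 3]. Because the start offsets use integer arithmetic only, no rounding is involved, and no individual step exceeds the requested `step`.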

OMMJREN
