How does this algorithm implement a sliding window?

Question

I'm currently prepping for a technical interview and practicing some data structures and algorithms questions with Python. There is a common question that asks you to find the longest substring in a string, such that that substring contains no repeated characters. Intuitively, I understand how to use a sliding window to solve this problem, which can be done with something like:

def longest_substring(s: str) -> int:
    
    longest_sub_string = 0
    
    if len(s) == 1:
        return 1
    
    for window_size in range(len(s) + 1, 0, -1):
        for i in range(len(s) - window_size + 1):
            window = s[i:i+window_size]
            if not self.contains_repeats(window) and len(window) > longest_sub_string:
                longest_sub_string = len(window)
    return longest_sub_string
    
    
def contains_repeats(s: str = None) -> bool:
    
    splt = list(s)
    if len(list(set(splt))) < len(splt):
        return True

However, this solution is not efficent for very long input strings taking something like O(n^2) time. I've found an alternative sliding window implementation:

def longest_substring(s: str) -> int:
    
    last_idxs = {}
    max_len = 0
    
    start_idx = 0
    
    for i in range(0, len(s)):
        
        if s[i] in last_idxs:
            start_idx = max(start_idx, last_idxs[s[i]] + 1)
        
        max_len = max(max_len, i-start_idx + 1)
        
        last_idxs[s[i]] = i
        
    return max_len

which solves the problem in linear time. I've picked apart what the code is doing and understand the individual parts, but cannot connect it to how a sliding window works, which is preventing me from being able to apply this approach to different problems. I could just memorize the code, but I'd like to understand how what's happeing in the second code block is similar to what's happening in the first. Can anyone explain this in a strighforward way that shows how this second variation implements a sliding window?

@Stef would be better imo to stitch the two comments as a single answer, since that's probably what OP's asking — Abhinav Mathur, Apr 25 '22 at 14:06
@Stef I guess the first algo can't become O(n^3) since a valid substring can't go over 26 characters. The first algo, in addition to not using a hashtable/set, does look at unrealistic substrings. One could make it linear, although with an unnecessary large constant, if I'm not mistaken. — user1984, Apr 25 '22 at 14:08
Oh, I know see that it's indeed O(n^3), since it doesn't break out of the the for loop when a repeat is found. The first algo is horrendous. Unreadable and inefficient :D — user1984, Apr 25 '22 at 14:11
Does the second function work? It returns 5 for `'aabbbcccccccccccccccccgdef'`. — wwii, Apr 25 '22 at 16:18
@wwii Isn't that the right answer, since 'cgdef' is the longest substring without repeats? — kcsquared, Apr 25 '22 at 16:24

score 2 · Accepted Answer · answered Apr 25 '22 at 14:14

I wouldn't call the first code a "sliding-window algorithm". It appears to simply be a bruteforce algorithm that tries all possible substrings. It might take time proportional to n³, because there are n² substrings to check, and applying contains_repeats might take time proportional to n per substring in the worst-case. Or rather, if k is the number of distinct characters in the alphabet, then contains_repeats can take time proportional to up to min(n, k); thus the first algorithm can take time up to n² min(n, k).

The second algorithm is a sliding-window algorithm. The window goes from index start_idx to index i. It takes time proportional to n because each of those two indices can only increase, never decrease, so the total number of iterations is at most 2n. Instead of trying all possible substrings, the algorithms has a varying-size "window" that "slides" from left to right (and never goes back).

Note that "sliding-window algorithm" is a term with at least two distinct meanings. In addition to this kind of string algorithms, it is sometimes used to refer to algorithms that use convolutions.

How does this algorithm implement a sliding window?

1 Answers1