Python: Return all Indices of every occurrence of a Sub List within a Main List

Question

I have a Main List and a Sub List and I want to locate the indices of every occurrence of the Sub List that are found in the Main List, in this example, I want the following list of indices returned.

>>> main_list = [1,2,3,4,4,4,1,2,3,4,4,4]
>>> sub_list = [4,4,4]

>>> function(main_list, sub_list)
>>> [3,9]

Ideally, the function should also ignore fragments of the sub_list, in this case [4,4] would be ignored. Also, I expect the elements to all be single digit integers. Here is a second example, for clarity:

>>> main_list = [9,8,7,5,5,5,5,5,4,3,2,5,5,5,5,5,1,1,1,5,5,5,5,5]
>>> sub_list = [5,5,5,5,5]

>>> function(main_list, sub_list)
>>> [3,11,19]

what happens with `main_list = [4, 4, 4]` and `sub_list = [4, 4]`? — Moses Koledoye, Jun 10 '16 at 23:19
Is your use case always with single digit elements? Because then you can make an easy regex-based solution. — wim, Jun 10 '16 at 23:20
Depending on your data, you might get some benefit from implementing a [string search algorithm](https://en.wikipedia.org/wiki/String_searching_algorithm) like Boyer-Moore or Knuth-Morris-Pratt, especially if `sub_list` is likely to be long or have a lot of almost-matches. — user2357112, Jun 10 '16 at 23:29
A naive solution `[i for i in range(len(main_list) - len(sub_list) + 1) if main_list[i:i+len(sub_list)] == sub_list]` — Padraic Cunningham, Jun 10 '16 at 23:45
Padraic Cunningham, your solution worked perfectly with a small modification. Thank you! — Jeremy Higgins, Jun 13 '16 at 18:18
The solutions in https://stackoverflow.com/questions/10106901/elegant-find-sub-list-in-list don't answer the question of locating the indices of the sublist. The solution in the comment above _does_ answer this. — Mick, Sep 20 '18 at 15:21

score 1 · Answer 1 · answered Jun 11 '16 at 00:01

1

Maybe using strings is the way to go?

import re
original = ''.join([str(x) for x in main_list])
matching = ''.join([str(x) for x in sub_list])
starts = [match.start() for match in re.finditer(re.escape(matching), original)]

The only problem with this one is that it doesn't count for overlapping values

answered Jun 11 '16 at 00:01

Aquiles

841
7
13

The answer provided by Padraic Cunningham in the comments of the question I believe is much better and it does take into consideration the overlapping values. – Aquiles Jun 11 '16 at 00:13

score 0 · Answer 2 · answered Jun 10 '16 at 23:38

You should be able to use a for loop, but then split it up into the length of you sub_list list, iterate through, and look for the sub lists in your main list. Try this:

main_list = [9,8,7,5,5,5,5,5,4,3,2,5,5,5,5,5,1,1,1,5,5,5,5,5]
sub_list = [5,5,5,5,5]

indices = []
for i in range(0, len(main_list)-len(sub_list)+1):
    temp_array = main_list[i:i+len(sub_list)]
    if temp_array == sub_list:
        indices.append(i)

print indices

dmitryro · Answer 3 · 2016-06-11T00:37:49.470

Here's a recursive way to do this:

list = [9,8,7,5,5,5,5,5,4,3,2,5,5,5,5,5,1,1,1,5,5,5,5,5]

def seq(array):  # get generator on the list
    for i in range(0,len(array)):
        yield i

sq = seq(list) # get the index generator



def find_consecutive_runs(array): # Let's use generator - we are not passing index

    i=next(sq) # get the index from generator

    if len(array) > 5: # or 3, or 4, or whatever - get slice and proceed

        arr = array[:5] # slice 5 elements

        if all(x==arr[0] for x in arr): # all list elements are identical
            print i # we found the index - let's print it

        find_consecutive_runs(array[1:len(array)]) # proceed with recursion

find_consecutive_runs(list) # the actual call

Python: Return all Indices of every occurrence of a Sub List within a Main List

3 Answers3

Linked

Related