3

I have a main list and a sub-list, and I want to find the index of every occurrence of the sub-list in the main list. In this example, I want the following list of indices returned:

>>> main_list = [1,2,3,4,4,4,1,2,3,4,4,4]
>>> sub_list = [4,4,4]

>>> function(main_list, sub_list)
[3,9]

Ideally, the function should also ignore fragments of sub_list; in this case, [4,4] would be ignored. I also expect all elements to be single-digit integers. Here is a second example for clarity:

>>> main_list = [9,8,7,5,5,5,5,5,4,3,2,5,5,5,5,5,1,1,1,5,5,5,5,5]
>>> sub_list = [5,5,5,5,5]

>>> function(main_list, sub_list)
[3,11,19]
  • What happens with `main_list = [4, 4, 4]` and `sub_list = [4, 4]`? – Moses Koledoye Jun 10 '16 at 23:19
  • Is your use case always with single digit elements? Because then you can make an easy regex-based solution. – wim Jun 10 '16 at 23:20
  • @MosesKoledoye I think that would return [0, 1] – iPhynx Jun 10 '16 at 23:21
  • Depending on your data, you might get some benefit from implementing a [string search algorithm](https://en.wikipedia.org/wiki/String_searching_algorithm) like Boyer-Moore or Knuth-Morris-Pratt, especially if `sub_list` is likely to be long or have a lot of almost-matches. – user2357112 Jun 10 '16 at 23:29
  • A naive solution: `[i for i in range(len(main_list) - len(sub_list) + 1) if main_list[i:i+len(sub_list)] == sub_list]` – Padraic Cunningham Jun 10 '16 at 23:45
  • Padraic Cunningham, your solution worked perfectly with a small modification. Thank you! – Jeremy Higgins Jun 13 '16 at 18:18
  • The solutions in https://stackoverflow.com/questions/10106901/elegant-find-sub-list-in-list don't answer the question of locating the indices of the sublist. The solution in the comment above _does_ answer this. – Mick Sep 20 '18 at 15:21
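
One of the comments above suggests a string search algorithm such as Knuth-Morris-Pratt. The following is only a minimal sketch of that idea applied to lists; the function name `kmp_find_all` is hypothetical and does not come from the thread. It reports overlapping occurrences, in line with the `[0, 1]` result mentioned in the comments.

```python
def kmp_find_all(main_list, sub_list):
    """Return the start index of every (possibly overlapping) occurrence."""
    m = len(sub_list)
    if m == 0:
        return []  # assumption: an empty sub_list yields no matches

    # failure[i] = length of the longest proper prefix of sub_list[:i + 1]
    # that is also a suffix of it
    failure = [0] * m
    k = 0
    for i in range(1, m):
        while k and sub_list[i] != sub_list[k]:
            k = failure[k - 1]
        if sub_list[i] == sub_list[k]:
            k += 1
        failure[i] = k

    indices = []
    k = 0  # number of sub_list items matched so far
    for i, item in enumerate(main_list):
        while k and item != sub_list[k]:
            k = failure[k - 1]
        if item == sub_list[k]:
            k += 1
        if k == m:
            indices.append(i - m + 1)  # start index of this occurrence
            k = failure[k - 1]  # keep going so overlapping matches are found
    return indices


print(kmp_find_all([1, 2, 3, 4, 4, 4, 1, 2, 3, 4, 4, 4], [4, 4, 4]))
```

For short lists like the ones in the question, the simple slice comparison is likely to be just as practical; a sketch like this mainly pays off when `sub_list` is long or has many near-matches.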

3 Answers

1

Maybe using strings is the way to go?

import re
original = ''.join([str(x) for x in main_list])
matching = ''.join([str(x) for x in sub_list])
starts = [match.start() for match in re.finditer(re.escape(matching), original)]

The only problem with this approach is that it doesn't account for overlapping matches.
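
If overlapping matches are wanted, one possible tweak is to wrap the pattern in a zero-width lookahead so that `re.finditer` does not consume the matched text. This is a sketch only, and it assumes every element is a single-digit integer (otherwise string positions no longer line up with list indices); the function name `find_overlapping` is hypothetical.

```python
import re


def find_overlapping(main_list, sub_list):
    """Regex-based search that also reports overlapping occurrences.

    Assumes every element is a single-digit integer, so each list item
    maps to exactly one character of the joined string.
    """
    original = ''.join(str(x) for x in main_list)
    matching = ''.join(str(x) for x in sub_list)
    # (?=...) matches at a position without consuming any characters,
    # so the next search can start one character later.
    pattern = '(?=' + re.escape(matching) + ')'
    return [match.start() for match in re.finditer(pattern, original)]


print(find_overlapping([4, 4, 4], [4, 4]))
```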

Aquiles
  • I believe the answer Padraic Cunningham gave in the comments on the question is much better, and it does handle overlapping matches. – Aquiles Jun 11 '16 at 00:13
0

You should be able to use a for loop: slide a window the length of your sub_list across the main list and compare each slice with sub_list. Try this:

main_list = [9,8,7,5,5,5,5,5,4,3,2,5,5,5,5,5,1,1,1,5,5,5,5,5]
sub_list = [5,5,5,5,5]

indices = []
for i in range(0, len(main_list)-len(sub_list)+1):
    temp_array = main_list[i:i+len(sub_list)]
    if temp_array == sub_list:
        indices.append(i)

print(indices)  # [3, 11, 19]
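
The same loop can be wrapped in a reusable function, which is closer to the `function(main_list, sub_list)` call shown in the question. This is only a sketch; the name `find_sublist_indices` is hypothetical, and returning an empty list for an empty `sub_list` is an assumption.

```python
def find_sublist_indices(main_list, sub_list):
    """Return every index at which sub_list starts inside main_list."""
    n = len(sub_list)
    if n == 0:
        return []  # assumption: an empty sub_list has no meaningful position
    indices = []
    # The last possible start is len(main_list) - n, hence the + 1.
    for i in range(len(main_list) - n + 1):
        if main_list[i:i + n] == sub_list:
            indices.append(i)
    return indices


print(find_sublist_indices([1, 2, 3, 4, 4, 4, 1, 2, 3, 4, 4, 4], [4, 4, 4]))
```

Because each slice is compared as a whole, partial runs such as `[4, 4]` are not reported when searching for `[4, 4, 4]`.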
nanoman
0

Here's a recursive way to do this:

main_list = [9,8,7,5,5,5,5,5,4,3,2,5,5,5,5,5,1,1,1,5,5,5,5,5]  # renamed so the built-in list is not shadowed

def seq(array):  # generator that yields successive indices of the list
    for i in range(len(array)):
        yield i

sq = seq(main_list)  # shared index generator, so the index is not passed as an argument

def find_consecutive_runs(array):
    i = next(sq)  # get the index of the current window from the generator

    if len(array) >= 5:  # or 3, or 4, or whatever; >= so the final window is checked too
        arr = array[:5]  # slice 5 elements

        if all(x == arr[0] for x in arr):  # all elements in the window are identical
            print(i)  # we found an index, so print it

        find_consecutive_runs(array[1:])  # proceed with recursion on the rest of the list

find_consecutive_runs(main_list)  # the actual call; prints 3, 11 and 19
dmitryro