3

I want to split a list into chunks, using the values of another list as the split points.

indices = [3, 5, 9, 13, 18]
my_list = ['a', 'b', 'c', ..., 'x', 'y', 'z']

So basically, split my_list into these ranges:

my_list[:3], my_list[3:5], my_list[5:9], my_list[9:13], my_list[13:18], my_list[18:]

I tried splitting indices into chunks of 2, but the result is not what I need.

[indices[i:i + 2] for i in range(0, len(indices), 2)]

My actual list length is 1000.

d789w

3 Answers

4

You could also do it using plain Python.

Data

indices = [3, 5, 9, 13, 18]
my_list = list('abcdefghijklmnopqrstuvwxyz')

Solution

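Use a list comprehension. Appending a dummy element to my_list lets the final stop index of -1 still include the last real item.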

[(my_list+[''])[slice(ix,iy)] for ix, iy in zip([0]+indices, indices+[-1])]

Output

[['a', 'b', 'c'],
 ['d', 'e'],
 ['f', 'g', 'h', 'i'],
 ['j', 'k', 'l', 'm'],
 ['n', 'o', 'p', 'q', 'r'],
 ['s', 't', 'u', 'v', 'w', 'x', 'y', 'z']]

Check that the correct index pairs are extracted:

dict(((ix,iy), (my_list+[''])[slice(ix,iy)]) for ix, iy in zip([0]+indices, indices+[-1]))

Output

{(0, 3): ['a', 'b', 'c'],
 (3, 5): ['d', 'e'],
 (5, 9): ['f', 'g', 'h', 'i'],
 (9, 13): ['j', 'k', 'l', 'm'],
 (13, 18): ['n', 'o', 'p', 'q', 'r'],
 (18, -1): ['s', 't', 'u', 'v', 'w', 'x', 'y', 'z']}
CypherX
  • I think your solution is incorrect, because the z character is lost. – Leo77 May 02 '21 at 21:25
  • @Leo77 Thank you for pointing it out. Updated the solution. Now it should give you `'z'` as well. – CypherX May 03 '21 at 03:58
  • 1. You don't need to use slice; you could just use my_list[ix : iy] instead. 2. When slicing lists, None can be used as an index: my_list[ix : ] and my_list[ix : None] are equivalent. As a result, you'll have something like this: `[my_list[i: j] for i, j in zip([0] + indices, indices + [None])]` – Leo77 May 25 '21 at 06:31
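The suggestion in the last comment can be wrapped in a small reusable function. This is only a sketch: the name split_at is made up for illustration, and it assumes indices is sorted in ascending order.

```python
def split_at(items, indices):
    """Split items at the given positions (hypothetical helper, not from the answer).

    Assumes indices is sorted ascending. A stop of None means "to the end".
    """
    starts = [0, *indices]
    stops = [*indices, None]
    return [items[i:j] for i, j in zip(starts, stops)]


my_list = list('abcdefghijklmnopqrstuvwxyz')
indices = [3, 5, 9, 13, 18]
chunks = split_at(my_list, indices)
print(chunks[0])   # ['a', 'b', 'c']
print(chunks[-1])  # ['s', 't', 'u', 'v', 'w', 'x', 'y', 'z']
```

Because the last stop is None rather than -1, no dummy element has to be appended to the list.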
3

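You can use itertools.zip_longest, which pads the shorter iterable with None so the last slice is open-ended:

import itertools as it

[my_list[a:b] for a,b in it.zip_longest([0]+indices, indices)]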

[['a', 'b', 'c'],
 ['d', 'e'],
 ['f', 'g', 'h', 'i'],
 ['j', 'k', 'l', 'm'],
 ['n', 'o', 'p', 'q', 'r'],
 ['s', 't', 'u', 'v', 'w', 'x', 'y', 'z']]

A little bit of code golf for fun:

map(my_list.__getitem__, map(lambda s: slice(*s), it.zip_longest([0]+indices, indices)))
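In Python 3, map returns a lazy iterator, so the one-liner above prints as a map object until it is consumed. A minimal, self-contained sketch that materializes it with list (the variable names here are illustrative only):

```python
import itertools as it

my_list = list('abcdefghijklmnopqrstuvwxyz')
indices = [3, 5, 9, 13, 18]

# zip_longest pads the shorter argument with None: (0, 3), (3, 5), ..., (18, None)
bounds = it.zip_longest([0] + indices, indices)
# slice(18, None) behaves like [18:], so the last chunk runs to the end of the list
lazy_chunks = map(my_list.__getitem__, map(lambda s: slice(*s), bounds))
golf_chunks = list(lazy_chunks)  # consume the iterator to get the actual chunks
print(len(golf_chunks))  # 6
```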
rafaelc
2

One way is to use the pairwise recipe, built on itertools.tee:

from itertools import tee

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

chunks = [my_list[i:j] for i, j in pairwise([0, *indices, len(my_list)])]
print(chunks)

Output:

[['a', 'b', 'c'],
 ['d', 'e'],
 ['f', 'g', 'h', 'i'],
 ['j', 'k', 'l', 'm'],
 ['n', 'o', 'p', 'q', 'r'],
 ['s', 't', 'u', 'v', 'w', 'x', 'y', 'z']]
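On Python 3.10 and later, itertools ships its own pairwise, so the hand-written recipe may not be needed. A sketch that prefers the built-in and falls back to the recipe on older versions:

```python
try:
    from itertools import pairwise  # available in Python 3.10+
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        # Fallback: the same tee-based recipe as in the answer
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

my_list = list('abcdefghijklmnopqrstuvwxyz')
indices = [3, 5, 9, 13, 18]

# Bracket the split points with 0 and len(my_list), then slice between neighbours
pair_chunks = [my_list[i:j] for i, j in pairwise([0, *indices, len(my_list)])]
print([len(c) for c in pair_chunks])  # [3, 2, 4, 4, 5, 8]
```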

If numpy is an option, use numpy.array_split, which is meant for this:

import numpy as np

np.array_split(my_list, indices)

Output:

[array(['a', 'b', 'c'], dtype='<U1'),
 array(['d', 'e'], dtype='<U1'),
 array(['f', 'g', 'h', 'i'], dtype='<U1'),
 array(['j', 'k', 'l', 'm'], dtype='<U1'),
 array(['n', 'o', 'p', 'q', 'r'], dtype='<U1'),
 array(['s', 't', 'u', 'v', 'w', 'x', 'y', 'z'], dtype='<U1')]
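Note that the result is a list of NumPy arrays rather than plain lists. If plain Python lists are wanted, one possible sketch (assuming NumPy is installed) converts each chunk back with tolist:

```python
import numpy as np

my_list = list('abcdefghijklmnopqrstuvwxyz')
indices = [3, 5, 9, 13, 18]

# array_split returns NumPy arrays; tolist() turns each one back into a plain list
np_chunks = [chunk.tolist() for chunk in np.array_split(my_list, indices)]
print(np_chunks[1])  # ['d', 'e']
```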
Chris