NOTE: As suggested by some people, I reposted this question to the codereview site


I want to split a list using another list which contains the lengths of each split.

For example:

>>> print list(split_by_lengths(list('abcdefg'), [2,1]))
[['a', 'b'], ['c'], ['d', 'e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [2,2]))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [2,2,6]))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [1,10]))
[['a'], ['b', 'c', 'd', 'e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [2,2,6,5]))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]

As you can see, if the lengths list does not cover the whole list, I append the remaining elements as an additional sublist. Also, I want to avoid empty lists at the end when the lengths list asks for more elements than there are in the list to split.

I already have a function that works as I want:

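from itertools import islice

def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(islice(iterable, n))

def split_by_lengths(list_, lens):
    li = iter(list_)
    for l in lens:
        elems = take(l, li)
        if not elems:
            break  # list exhausted: this also skips the else clause below
        yield elems
    else:
        # lens ran out first: yield whatever is left, if anything
        remaining = list(li)
        if remaining:
            yield remaining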

But I wonder if there is a more Pythonic way to write such a function.

Note: I took take(n, iterable) from the Itertools Recipes.

VGonPa
  • This question appears to be off-topic because it belongs on http://codereview.stackexchange.com – jonrsharpe Apr 04 '14 at 08:39
  • I did not know about codereview.stackexchange.com, but I don't see why this question does not fit on Stack Overflow. However, if you believe it belongs on codereview, I can repost it there. – VGonPa Apr 04 '14 at 09:23
  • If you repost, leave a comment here with link to it -- so I can submit my answer. – martineau Apr 04 '14 at 09:31
  • Reposted [**here**](http://codereview.stackexchange.com/questions/46246/python-split-a-list-using-another-list-whose-items-are-the-split-lengths) – VGonPa Apr 04 '14 at 09:40

1 Answer

You can do this using itertools.islice:

from itertools import islice

def split_by_lengths(seq, num):
    it = iter(seq)
    for x in num:
        out = list(islice(it, x))
        if out:
            yield out
        else:
            return  # seq is exhausted: end the generator here
    remain = list(it)
    if remain:
        yield remain

Demo:

>>> list(split_by_lengths(list('abcdefg'), [2,1]))
[['a', 'b'], ['c'], ['d', 'e', 'f', 'g']]
>>> list(split_by_lengths(list('abcdefg'), [2,2]))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> list(split_by_lengths(list('abcdefg'), [2,2,6]))
[['a', 'b'], ['c', 'd'], ['e', 'f', 'g']]
>>> print list(split_by_lengths(list('abcdefg'), [1,10]))
[['a'], ['b', 'c', 'd', 'e', 'f', 'g']]

A shorter version of the above. Note that, unlike the first version, it won't short-circuit as soon as the iterator is exhausted: it still walks through every remaining length in num.

def split_by_lengths(seq, num):
    it = iter(seq)
    out = [x for x in (list(islice(it, n)) for n in num) if x]
    remain = list(it)
    return out if not remain else out + [remain]
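
To make the short-circuit difference concrete, here is a minimal sketch that records how many lengths each version actually reads. The names split_lazy, split_eager and counting are hypothetical, introduced only for this illustration; the two function bodies mirror the generator version and the shorter version above.

```python
from itertools import islice

def split_lazy(seq, lens):
    """Generator version: stops reading lens once seq is exhausted."""
    it = iter(seq)
    for n in lens:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk
    rest = list(it)
    if rest:
        yield rest

def split_eager(seq, lens):
    """Shorter version: always reads every length in lens."""
    it = iter(seq)
    out = [c for c in (list(islice(it, n)) for n in lens) if c]
    rest = list(it)
    return out + [rest] if rest else out

def counting(lens, seen):
    """Hypothetical helper: yield each length and record it in seen."""
    for n in lens:
        seen.append(n)
        yield n

lazy_seen, eager_seen = [], []
lazy_result = list(split_lazy('abcdefg', counting([2, 2, 6, 5, 4], lazy_seen)))
eager_result = split_eager('abcdefg', counting([2, 2, 6, 5, 4], eager_seen))

# Both give [['a', 'b'], ['c', 'd'], ['e', 'f', 'g']], but
# lazy_seen is [2, 2, 6, 5] while eager_seen is [2, 2, 6, 5, 4].
```

For short lengths lists the difference hardly matters; it would only become noticeable if the lengths came from a long or expensive iterable.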
Ashwini Chaudhary
  • This is a nice answer, but produces empty lists if the lengths list contains more elements than the sequence to split E.g.: `print list(split_by_lengths(list('abcdefg'), [2,2,6,5]))` produces `[['a'], ['b', 'c'], ['d', 'e', 'f', 'g'], []] ` whose empty list at the end I want to avoid. I'll edit my question to include this case. – VGonPa Apr 04 '14 at 08:55
  • @VGonPa I think the output should be `[['a', 'b'], ['c', 'd'], ['e', 'f', ...` – Ashwini Chaudhary Apr 04 '14 at 09:01
  • @VGonPa Anyways I've updated the solution. – Ashwini Chaudhary Apr 04 '14 at 09:03
  • Yes, you are right, it was a mistake when copy paste: `print list(split_by_lengths(list('abcdefg'), [2,2,6,5]))` produces `[['a', 'b'], ['c', 'd'], ['e', 'f', 'g'], []]` in your code – VGonPa Apr 04 '14 at 09:05
  • Now it works fine, but could you please explain how this solution is more Pythonic? – VGonPa Apr 04 '14 at 09:16
  • @VGonPa I've added a smaller answer. – Ashwini Chaudhary Apr 04 '14 at 09:30