What is the preferred way to concatenate sequences in Python 3?

Right now, I'm doing:

import functools
import operator

def concatenate(sequences):
    return functools.reduce(operator.add, sequences)

print(concatenate([['spam', 'eggs'], ['ham']]))
# ['spam', 'eggs', 'ham']

Needing to import two separate modules to do this seems clunky.

An alternative could be:

def concatenate(sequences):
    concatenated_sequence = []
    for sequence in sequences:
        concatenated_sequence += sequence
    return concatenated_sequence

However, this is incorrect because it always returns a list, and you don't know that the sequences are lists.
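
As a small illustration of that problem, here is the same list-accumulating function applied to tuples and strings. The inputs are made up for the example; the point is only that the result type is a list regardless of what was passed in.

```python
def concatenate(sequences):
    # Same approach as above: always start from an empty list.
    concatenated_sequence = []
    for sequence in sequences:
        concatenated_sequence += sequence  # list += iterable extends the list
    return concatenated_sequence

result_tuples = concatenate([(1, 2), (3,)])
result_strings = concatenate(['ab', 'cd'])

print(result_tuples)   # [1, 2, 3]  (a list, not a tuple)
print(result_strings)  # ['a', 'b', 'c', 'd']  (a list of characters, not 'abcd')
```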

You could do:

import copy

def concatenate(sequences):
    head, *tail = sequences
    concatenated_sequence = copy.copy(head)
    for sequence in tail:
        concatenated_sequence += sequence
    return concatenated_sequence

But that seems horribly bug-prone: a direct call to copy? (I know head.copy() works for lists, but copy isn't part of the Sequence ABC, so you can't rely on it... what if you get handed tuples or strings?) You have to copy to avoid mutating the first sequence in case you get handed a MutableSequence. Moreover, this solution forces you to unpack the entire set of sequences first. Trying again:

import copy 

def concatenate(sequences):
    iterable = iter(sequences)
    head = next(iterable)
    concatenated_sequence = copy.copy(head)
    for sequence in iterable:
        concatenated_sequence += sequence
    return concatenated_sequence
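
To make the mutation concern concrete, here is a minimal sketch comparing the version above with a hypothetical variant that skips the copy. The function names `concatenate_without_copy` and `concatenate_with_copy` are invented for this comparison.

```python
import copy

def concatenate_without_copy(sequences):
    iterable = iter(sequences)
    result = next(iterable)      # no copy: this IS the caller's first sequence
    for sequence in iterable:
        result += sequence       # for a list, += extends it in place
    return result

def concatenate_with_copy(sequences):
    iterable = iter(sequences)
    result = copy.copy(next(iterable))  # shallow copy protects the caller's object
    for sequence in iterable:
        result += sequence
    return result

first_a = ['spam', 'eggs']
concatenate_without_copy([first_a, ['ham']])
print(first_a)  # ['spam', 'eggs', 'ham']  (the input list was modified)

first_b = ['spam', 'eggs']
concatenate_with_copy([first_b, ['ham']])
print(first_b)  # ['spam', 'eggs']  (the input list is untouched)
```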

But come on... this is Python! So... what is the preferred way to do this?

ToBeReplaced
  • Have you seen `itertools.chain()`? I'm not sure it handles all of your intended use cases, though. – Wooble Jan 15 '13 at 16:33
  • Don't be afraid of standard library imports. It is very likely that some other module (even from the standard library) will grab `functools` and/or `operator` anyway. – Oleh Prypin Jan 15 '13 at 16:45

3 Answers

I'd use itertools.chain.from_iterable() instead:

import itertools

def chained(sequences):
    return itertools.chain.from_iterable(sequences)

or, since you are using Python 3, you could use the new yield from syntax, available from Python 3.3 onwards (look ma, no imports!):

def chained(sequences):
    for seq in sequences:
        yield from seq

Both of these return iterators (use list() on them if you must materialize the full list). Most of the time you do not really need to construct a whole new sequence from the concatenated sequences; you just want to loop over them to process the items or search for something.
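
As a quick sketch of how the two variants behave, the snippet below runs both on the same made-up input and then searches the chained items without ever building a list. The names `chained_with_itertools` and `chained_with_yield_from` are just labels for this comparison.

```python
import itertools

def chained_with_itertools(sequences):
    return itertools.chain.from_iterable(sequences)

def chained_with_yield_from(sequences):
    for seq in sequences:
        yield from seq

# Illustrative input mixing a list, a tuple and a string.
data = [['spam', 'eggs'], ('ham',), 'ab']

from_itertools = list(chained_with_itertools(data))
from_generator = list(chained_with_yield_from(data))
print(from_itertools)  # ['spam', 'eggs', 'ham', 'a', 'b']
print(from_generator)  # ['spam', 'eggs', 'ham', 'a', 'b']

# Searching works directly on the iterator; no intermediate list is needed.
has_ham = 'ham' in chained_with_yield_from(data)
first_e = next(item for item in chained_with_itertools(data) if item.startswith('e'))
print(has_ham, first_e)  # True eggs
```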

Note that for strings, you should use str.join() instead of any of the techniques described either in my answer or your question:

concatenated = ''.join(sequence_of_strings)

Combined, to handle sequences quickly and correctly, I'd use:

def chained(sequences):
    for seq in sequences:
        yield from seq

def concatenate(sequences):
    sequences = iter(sequences)
    first = next(sequences)
    if hasattr(first, 'join'):
        return first + ''.join(sequences)
    return first + type(first)(chained(sequences))

This works for tuples, lists and strings:

>>> concatenate(['abcd', 'efgh', 'ijkl'])
'abcdefghijkl'
>>> concatenate([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> concatenate([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
(1, 2, 3, 4, 5, 6, 7, 8, 9)

and uses the faster ''.join() for a sequence of strings.
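
Two properties of this approach are worth spelling out with a sketch: because it only calls iter() and next() on its argument, it also accepts a generator of sequences, and the result takes the type of the first sequence. The inputs below are invented for illustration.

```python
def chained(sequences):
    for seq in sequences:
        yield from seq

def concatenate(sequences):
    sequences = iter(sequences)
    first = next(sequences)
    if hasattr(first, 'join'):
        return first + ''.join(sequences)
    return first + type(first)(chained(sequences))

# A generator expression works as input; nothing needs to be a list up front.
from_generator = concatenate(str(n) for n in range(3))
print(from_generator)  # 012

# With mixed types, the first sequence decides the type of the result.
list_first = concatenate([[1], (2, 3)])
tuple_first = concatenate([(1,), [2, 3]])
print(list_first)   # [1, 2, 3]
print(tuple_first)  # (1, 2, 3)
```

Note that an empty input makes next() raise StopIteration, so callers that may pass no sequences at all would need to handle that case themselves.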

Martijn Pieters

What is wrong with:

from itertools import chain

def chain_sequences(*sequences):
    return chain(*sequences)
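
One practical difference between chain(*sequences) and chain.from_iterable(sequences) is when the outer iterable gets consumed. The sketch below uses a hypothetical generator that records each sequence it hands out, so the timing becomes visible.

```python
import itertools

consumed = []

def sequences():
    # Hypothetical source that logs every sequence it yields.
    for seq in (['spam', 'eggs'], ['ham'], ['bacon']):
        consumed.append(seq)
        yield seq

# Star-unpacking has to pull every sequence before chain() is even called.
eager = itertools.chain(*sequences())
consumed_by_star = len(consumed)

consumed.clear()

# from_iterable() pulls sequences only as iteration reaches them.
lazy = itertools.chain.from_iterable(sequences())
consumed_before_iteration = len(consumed)
first_item = next(lazy)
consumed_after_first_item = len(consumed)

print(consumed_by_star, consumed_before_iteration, consumed_after_first_item)
# 3 0 1
```

For a handful of in-memory lists the two are interchangeable; the lazy form matters when the outer iterable is large, expensive to produce, or unbounded.
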
Samantha Atkins

Use itertools.chain.from_iterable.

import itertools

def concatenate(sequences):
    return list(itertools.chain.from_iterable(sequences))

The call to list is needed only if you need an actual new list, so skip it if you just iterate over this new sequence once.
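
The "once" matters because the chain object is a one-shot iterator. A short sketch with made-up data:

```python
import itertools

sequences = [['spam', 'eggs'], ['ham']]

# Without list(): a single-pass iterator.
lazy = itertools.chain.from_iterable(sequences)
first_pass = list(lazy)
second_pass = list(lazy)  # the iterator is already exhausted
print(first_pass)   # ['spam', 'eggs', 'ham']
print(second_pass)  # []

# With list(): a real list that supports len(), indexing and repeated iteration.
materialized = list(itertools.chain.from_iterable(sequences))
print(len(materialized), materialized[-1])  # 3 ham
```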

Oleh Prypin