7

Consider a function defined as:

def fun(a, *args):
    print(type(args), args)

When called, it packs the extra positional arguments as a tuple.

>>> fun(2, 3, 4)
<class 'tuple'> (3, 4)

I want to achieve something similar outside of function arguments. I thought extended iterable unpacking would do it, but it always packs the extra items into a list, never a tuple:

# RHS is a tuple
>>> (a, *args) = (2, 3, 4)
>>> type(args)
<class 'list'>
>>> args
[3, 4]  #but args is not a tuple!

That makes it no different from:

# RHS is a list
>>> (a, *args) = [2, 3, 4]
>>> type(args)
<class 'list'>
>>> args
[3, 4]

I understand that this is the behavior the PEP (PEP 3132) specifies.

I am asking if there is another way to achieve what I want.

Of course, if this cannot be achieved directly during the assignment, I could convert args to a tuple afterwards:

>>> args = tuple(args)
>>> args
(3, 4)

Mad Physicist
2020

2 Answers

3

The closest approach I know of (which, I'll admit, doesn't directly answer your question) is to slice the tuple:

t = (2, 3, 4)
a, args = t[0], t[1:]

>>> type(args)
<class 'tuple'>

Yes, it's verbose, but at least it avoids the intermediate list.

Carcigenicate
  • Let's see if we get a cleaner solution. The PEP specifically mentions this method as less clean and less efficient: _Many algorithms require splitting a sequence in a "first, rest" pair. With the syntax,_ `first, rest = seq[0], seq[1:]` _is replaced by the cleaner and probably more efficient:_ `first, *rest = seq` – 2020 Dec 16 '19 at 16:24
  • 1
    @inquisitiveOne By all means. I'm not saying that this is the best solution, but it's the only one I can think of at the moment. I'd be curious if it is actually noticeably less efficient though. I can't offhand think of a reason why one would be more efficient, unless there's a difference in how much work is being offloaded to C in each. – Carcigenicate Dec 16 '19 at 16:27
  • 1
    On the other hand, if constructing a new tuple with `t[1:]` is the bottleneck in your code, your code is almost certainly efficient enough. – chepner Dec 16 '19 at 19:58
  • 1
    @Carcigenicate: I believe the difference in cost is only in fixed overhead, so it doesn't affect big-O runtime; `first, rest = seq[0], seq[1:]` involves significantly more bytecode (11 instructions, vs. four for `first, *rest = seq`), but the work done would be roughly the same either way. As it happens, at least for `tuple`s, `first, rest = seq[0], seq[1:]` is reliably faster in microbenchmarks (on my CPython 3.8.0, Linux x64), as it can use dedicated `tuple` slicing code, whereas unpacking uses generic iterator consumption code to build the `list`. `first, *rest` is more generic. – ShadowRanger Dec 16 '19 at 20:13
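
For anyone who wants to check the timing claim in the comments on their own interpreter, here is a minimal `timeit` sketch. The sample tuple, the wrapper function names, and the iteration count are assumptions chosen for illustration. Wrapping each statement in a function adds the same call overhead to both sides, so treat the numbers as a rough comparison only; results will vary by Python version and platform.

```python
import timeit

# Hypothetical sample input; any non-empty tuple would do.
seq = tuple(range(10))

def by_slicing():
    # Indexing plus slicing: `rest` keeps the type of `seq` (a tuple here).
    first, rest = seq[0], seq[1:]
    return first, rest

def by_star_unpacking():
    # Extended iterable unpacking: `rest` is always a list.
    first, *rest = seq
    return first, rest

# Both approaches yield the same elements; only the container type differs.
assert by_slicing() == (0, tuple(range(1, 10)))
assert by_star_unpacking() == (0, list(range(1, 10)))

slice_time = timeit.timeit(by_slicing, number=100_000)
star_time = timeit.timeit(by_star_unpacking, number=100_000)
print(f"slicing: {slice_time:.4f}s  star-unpacking: {star_time:.4f}s")
```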
1

There are some additional considerations here to keep in mind. The type of the RHS is not restricted to list or tuple. You can unpack an arbitrary iterable:

import itertools
a, *args = itertools.repeat('hi', 3)

Since the RHS can be any iterable, there is no good reason that args should be either type in particular. It makes sense that the result is a list, since it has to accumulate an arbitrary number of elements, but there is no major relevant difference in the C code, at least in CPython.

Function arguments pretty much have to be a tuple because the function can return both the packed object and a closure that refers to it. You do not want to be able to modify args in this case:

def func(arg, *args):
    assert args
    def blah():
        return args[0]
    return args, blah

This is a bit contrived, but illustrates why args can be considered public and should not be mutable in a function.
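
As a rough illustration of that point, the sketch below calls the same `func` and shows that the returned tuple cannot be modified in place, so the closure keeps seeing the original value. The call arguments and the variable names `packed` and `mutation_blocked` are assumptions made for this example.

```python
def func(arg, *args):
    assert args
    def blah():
        return args[0]
    return args, blah

# Hypothetical call: 1 binds to `arg`, the rest are packed into `args`.
packed, blah = func(1, 2, 3)

try:
    packed[0] = 99  # tuples do not support item assignment
    mutation_blocked = False
except TypeError:
    mutation_blocked = True

print(type(packed), mutation_blocked, blah())
```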

All that being said, here is a general purpose method for getting the unpacking you want, regardless of the RHS type:

t = (2, 3, 4)
it = iter(t)
a, args = next(it), tuple(it)
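
If this comes up often, the three lines above could be wrapped in a small helper. This is only a sketch: `split_first` is a hypothetical name, not a standard library function, and raising `ValueError` on an empty input is a design choice made here to mirror what `a, *args = []` does.

```python
def split_first(iterable):
    """Return (first, rest), with rest packed as a tuple.

    Hypothetical helper; works for any iterable, not just sequences.
    """
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        # Mirror the exception type raised by `a, *args = []`.
        raise ValueError("need at least one element to unpack") from None
    # tuple() consumes whatever the iterator has left.
    return first, tuple(it)

a, args = split_first([2, 3, 4])
print(a, type(args), args)
```

Because it only relies on `iter()` and `next()`, the helper also accepts generators and other one-shot iterators.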
Mad Physicist
  • That's not a strong reason why "Function arguments pretty much have to be a tuple"; it's not like you couldn't add `args = tuple(args)` as the first line if you needed that behavior. I suspect they're `tuple`s only because they're usually small, and the CPython reference interpreter has some optimizations for small `tuple`s, and because it makes CPython C code easier; you don't have to worry that the `tuple` might be resized or changed during processing, even if you call out to Python level code (which can happen implicitly from simple stuff like length checks). – ShadowRanger Dec 16 '19 at 19:52
  • 1
    Note: I like that you point out that `a, *args = x` can't be dependent on the type of `x`. I do think it's weird they chose `list` since it's exactly as variable as the function argument case; it's probably slightly faster for truly variable inputs to convert to `list`, but not enough to really matter. Side-note: In 3.8, you can one-line the general purpose method to: `a, args = next(it := iter(t)), tuple(it)`, or even `a, args = next(it := iter(t)), (*it,)`. Not really recommending it, but it's an option. – ShadowRanger Dec 16 '19 at 20:04
  • @ShadowRanger. I do point out that neither type is particularly more justifiable than the other, given that the internal implementations are nearly identical. I would guess that the specific choices here are more historical at this point than based on any concrete justification. – Mad Physicist Dec 16 '19 at 20:56
  • 1
    I did eventually figure out why `list` is preferred from a CPython perspective: There's no convenient/efficient way to *trim* a `tuple` in place. The implementation of `first, *rest, last = x` is roughly `it = iter(x)`, `first = next(it)`, `rest = list(it)`, `last = rest[-1]`, `del rest[-1:]` (with refcounting optimizations so `del rest[-1:]` is effectively free). The last step there can't be legally done with `tuple` without leaving additional memory permanently wasted (longer than the lifetime of the `tuple` even, since small `tuple`s have a free list that would continue holding the memory). – ShadowRanger Dec 16 '19 at 22:01
  • 1
    This isn't a problem for functions, where `*args` is never followed by additional positional arguments, so it can unconditionally collect all of the remaining arguments, and never needs to trim them down again. So while `first, *rest = x` doesn't really benefit from `list`iness, `*rest, last = x` does. – ShadowRanger Dec 16 '19 at 22:03
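
The steps listed in the comments above can be written out as plain Python. This is a sketch of the described behavior only, not CPython's actual C implementation, and `unpack_first_rest_last` is a hypothetical name used for illustration.

```python
def unpack_first_rest_last(x):
    """Pure-Python approximation of `first, *rest, last = x`."""
    it = iter(x)
    first = next(it)
    rest = list(it)   # collect everything that remains
    last = rest[-1]   # the trailing target takes the final item
    del rest[-1:]     # trim in place: cheap for a list, not possible for a tuple
    return first, rest, last

# Compare against the real unpacking statement on a sample input.
first, *rest, last = range(5)
result = unpack_first_rest_last(range(5))
print(result)
```

The in-place `del` is the step that a tuple cannot perform, which is the reason given above for collecting into a `list`.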