
With the Python function syntax def f(**kwargs), a keyword argument dictionary kwargs is created inside the function. Dictionaries are mutable, so if I modify the kwargs dictionary, could that have any effect outside the scope of my function?

From my understanding of how dictionary unpacking and keyword argument packing work, I see no reason to believe it might be unsafe, and there seems to be no danger of this in Python 3.6:

def f(**kwargs):
    kwargs['demo'] = 9

if __name__ == '__main__':
    demo = 4
    f(demo=demo)
    print(demo)     # 4

    kwargs = {}
    f(**kwargs)
    print(kwargs)   # {}

    kwargs['demo'] = 4
    f(**kwargs)
    print(kwargs)    # {'demo': 4}

However, is this implementation-specific, or is it part of the Python spec? Am I overlooking any situation or implementation where (barring modifications to arguments which are themselves mutable, like kwargs['somelist'].append(3)) this sort of modification might be a problem?

Paul
  • To me, your tests are enough to prove that this is safe with your implementation. Is it enough though? I'm curious to see the answers. – Right leg Aug 25 '17 at 14:29
  • @Rightleg The question came up in the context of a FOSS library function which is intended to support many implementations and use cases. I'm fairly convinced that it's safe but I don't have any iron-clad reasoning that would say, "It's a bug if this is unsafe in some implementation." – Paul Aug 25 '17 at 14:32

3 Answers


It is always safe. As the spec says:

If the form “**identifier” is present, it is initialized to a new ordered mapping receiving any excess keyword arguments, defaulting to a new empty mapping of the same type.

Emphasis added (on "new").

You are always guaranteed to get a new mapping object inside the callable. See this example:

def f(**kwargs):
    print((id(kwargs), kwargs))

kwargs = {'foo': 'bar'}
print(id(kwargs))
# 140185018984344
f(**kwargs)
# (140185036822856, {'foo': 'bar'})

So, although f may modify an object that is passed via **, it can't modify the caller's **-object itself.
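A minimal sketch of that distinction, using a hypothetical function `mutate` and made-up keys (neither is from the question): the mapping the callee receives is a fresh dict, but the copy is shallow, so the values are the same objects the caller holds.

```python
def mutate(**kwargs):
    # Rebinding a key only touches the callee's own, freshly built dict.
    kwargs['count'] = 99
    # Mutating a value in place is visible to the caller, because both
    # dicts refer to the same list object.
    kwargs['items'].append(3)
    return kwargs


caller_kwargs = {'count': 1, 'items': [1, 2]}
received = mutate(**caller_kwargs)

print(received is caller_kwargs)   # False
print(caller_kwargs['count'])      # 1
print(caller_kwargs['items'])      # [1, 2, 3]
```

The first two results are what the quoted passage guarantees; the third is ordinary shared-reference behaviour and has nothing to do with ** specifically.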


Update: Since you asked about corner cases, here is a special hell for you that does in fact modify the caller's kwargs:

def f(**kwargs):
    kwargs['recursive!']['recursive!'] = 'Look ma, recursive!'

kwargs = {}
kwargs['recursive!'] = kwargs
f(**kwargs)
assert kwargs['recursive!'] == 'Look ma, recursive!'

This you probably won't see in the wild, though.

user2722968
  • I'd say the corner case is a special case of "the arguments themselves being mutable", but it's clever nonetheless. – Paul Aug 25 '17 at 15:54
  • @Paul - Indeed, no need for anything as exotic as a self-referential dictionary. If one of the input dictionary's elements is mutable (like a list) and that element is mutated inside the function (such as with `.append()`), then the input dictionary will be mutated. – John Y Aug 25 '17 at 16:58
  • @JohnY Yes, I used that very example in my question ;) – Paul Aug 25 '17 at 17:14
  • @Paul - Ah, so you did! There you go then. :) Besides being clever, the example in this answer showed me something I didn't realize before: That the keys in the input dictionary can have characters that are illegal in parameter names! "Look Ma, exclamation points! (And spaces, and whatever else!)" – John Y Aug 25 '17 at 17:34
  • Define "safe". Will it blow up your computer? Probably not. Will arbitrary modifications cause your function to do what you intend? Probably not. You need to be just as careful as with any other modifications to arguments of functions - and perhaps more so, because this would be an easy point for bugs in your Python to slip in. – Russia Must Remove Putin Aug 25 '17 at 17:38
  • There's also no specific guarantee that `**kwargs` will be a _mutable_ mapping, though it is in CPython. – jirassimok Apr 29 '20 at 22:06

For Python-level code, the kwargs dict inside a function will always be a new dict.

For C extensions, though, watch out. The C API version of kwargs will sometimes pass a dict through directly. In previous versions, it would even pass dict subclasses through directly, leading to the bug (now fixed) where

'{a}'.format(**collections.defaultdict(int))

would produce '0' instead of raising a KeyError.

If you ever have to write C extensions, possibly including Cython, don't try to modify the kwargs equivalent, and watch out for dict subclasses on old Python versions.
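The Python-level side of this can be checked directly. The sketch below assumes a CPython 3 release in which the bug mentioned above is fixed; `kwargs_type` is a hypothetical helper, not part of any library. It shows that a dict subclass unpacked with ** arrives as a plain dict, and that `str.format_map` is the documented way to have the mapping's own lookup behaviour used.

```python
import collections


def kwargs_type(**kwargs):
    # Report the concrete type of the mapping the callee receives.
    return type(kwargs)


defaults = collections.defaultdict(int, a=1)

# The defaultdict is unpacked into a brand-new plain dict.
print(kwargs_type(**defaults) is dict)    # True

# With ** unpacking, the default factory is never consulted.
try:
    '{missing}'.format(**defaults)
except KeyError:
    print('KeyError')

# format_map uses the mapping object itself, so __missing__ applies.
print('{missing}'.format_map(defaults))   # 0
```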

user2357112
  • Note: use `string.Formatter.vformat` if you really want this. – o11c Aug 25 '17 at 16:49
  • ... huh. I guess things have changed since my old Python2 code needed this. Of course, I even needed to use the undocumented internals to change the recursion limit ... – o11c Aug 25 '17 at 16:51
  • When is "sometimes" in this context? Is there a reference for this? – Paul Aug 25 '17 at 17:17
  • @Paul: "Sometimes" for the following reasons: 1) As far as I'm aware, this is an undocumented implementation detail, and I don't know if some CPython versions behave differently, or if future versions might change the behavior. 2) The dict subclass fix means that some objects that are technically dicts (specifically, subclass instances) aren't passed through directly. 3) If you do something like `c_func(a=1, **{'b': 2})`, a new dict will be created. – user2357112 Aug 25 '17 at 17:21
  • I think it currently happens consistently for any case where you provide a single non-subclass dict of keyword arguments to a C function that uses the C API equivalent of `kwargs`. This is just going off what I remember of the source code, though. I don't have a reference to cite. – user2357112 Aug 25 '17 at 17:35

Both of the above answers are correct in stating that, technically, mutating kwargs will never have an effect on the parent scopes.

But... that's not the end of the story. It is possible for a reference to kwargs to be shared outside of the function scope, and then you run into all the usual shared mutable state problems that you'd expect.

def create_classes(**kwargs):

    class Class1:
        def __init__(self):
            self.options = kwargs

    class Class2:
        def __init__(self):
            self.options = kwargs

    return (Class1, Class2)

Class1, Class2 = create_classes(a=1, b=2)

a = Class1()
b = Class2()

a.options['c'] = 3

print(b.options)
# {'a': 1, 'b': 2, 'c': 3}
# other class's options are mutated because we forgot to copy kwargs

Technically this answers your question, since sharing a reference to mutable kwargs does lead to effects outside the function's scope.

I've been bitten multiple times by this in production code, and it's something that I explicitly watch out for now, both in my own code and when reviewing others'. The mistake is obvious in my contrived example above, but it's much sneakier in real code when creating factory funcs that share some common options.
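One possible fix, sketched with the same contrived class names: give each instance its own shallow copy of kwargs, so that adding a key through one instance no longer shows up in the other.

```python
def create_classes(**kwargs):

    class Class1:
        def __init__(self):
            # dict(kwargs) makes a new shallow copy for every instance.
            self.options = dict(kwargs)

    class Class2:
        def __init__(self):
            self.options = dict(kwargs)

    return (Class1, Class2)


Class1, Class2 = create_classes(a=1, b=2)

a = Class1()
b = Class2()

a.options['c'] = 3

print(a.options)  # {'a': 1, 'b': 2, 'c': 3}
print(b.options)  # {'a': 1, 'b': 2}
```

The copy is shallow, so mutable values stored in kwargs (lists, nested dicts) are still shared between instances; `copy.deepcopy` is an option if those need to be independent as well.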

Nick Sweeting