The question "How does dask.delayed handle mutable inputs?" explains this for delayed,
but what should I do if I need the input to be mutated?
I have been trying out Dask and I need to understand how dictionaries are mutated when functions are called through distributed.Client.get versus sequentially (the normal way).
Sequential
def foo(dictionary):
    dictionary['foo'] = 'foo was called'

def bar(dictionary):
    dictionary['bar'] = 'bar was called'

dictionary = {}
print(dictionary) # {}
foo(dictionary)
print(dictionary) # {'foo': 'foo was called'}
bar(dictionary)
print(dictionary) # {'foo': 'foo was called', 'bar': 'bar was called'}
This works the way I expect: the dictionary is mutated, and it has two keys after the calls to foo and bar.
Dask
from dask.distributed import Client
client = Client(processes=False)

def foo(dictionary):
    dictionary['foo'] = 'foo was called'

def bar(dictionary):
    dictionary['bar'] = 'bar was called'

dictionary = {}
dsk = {'foo': (foo, dictionary), 'bar': (bar, dictionary)}
client.get(dsk, ['foo', 'bar'])
print(dictionary) # {}
Why does this print an empty dict? Why is the dictionary not mutated? I noticed that id(dictionary) is different inside each function, so I understand that each function receives a copy.
Is it safe to assume that every function gets its own copy of the objects passed to it, so that I can mutate them inside the function and leave the global one untouched? Is this understanding correct?
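
To make my guess concrete, here is a minimal sketch in plain Python (no Dask involved) of what I think is happening. It assumes that the argument is copied, for example by being serialized, before the function runs. The helper run_on_copy is hypothetical and only stands in for whatever the scheduler really does. I have not confirmed that this is how Dask behaves. foo also returns the dictionary here, so the change can be read back from the result instead of from the original object.

```python
import pickle


def foo(dictionary):
    dictionary['foo'] = 'foo was called'
    # Return the mutated object so the caller can see the change
    return dictionary


def run_on_copy(func, arg):
    # Hypothetical stand-in for a scheduler: round-trip the argument
    # through pickle so the function receives a copy, not the original.
    # This is an assumption for illustration, not Dask's actual code.
    copied = pickle.loads(pickle.dumps(arg))
    return func(copied)


dictionary = {}
result = run_on_copy(foo, dictionary)

print(dictionary)            # {}
print(result)                # {'foo': 'foo was called'}
print(result is dictionary)  # False
```

If that assumption holds, the global dictionary stays empty because only the copy is mutated, and the only way to see the change is through the value the function returns.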