I'm trying to pickle a workflow object from pyutilib.workflow with the dill extension, using Python 2.7. The end goal is to insert these workflow objects into a MongoDB database and pull them out on the other end when needed:
from pyutilib import workflow
import testworkflow
from bson.binary import Binary
import pickle
import dill
import weakref

A = testworkflow.testTask()
w = workflow.Workflow()
w.add(A)

# The with block closes the file, so no explicit close() is needed.
with open('w.dill', 'wb') as f:
    dill.dump(w, f)
testworkflow.py only contains testTask(), which is written as follows:
import pyutilib.workflow

class testTask(pyutilib.workflow.Task):
    def __init__(self, *args, **kwds):
        pyutilib.workflow.Task.__init__(self, *args, **kwds)
        self.inputs.declare('x')
        self.inputs.declare('y')
        self.outputs.declare('z')

    def execute(self):
        self.z = self.x + self.y
But when I run the script to serialize the workflow object, I get a massive traceback from pickle.py, at the very bottom of which is simply "AssertionError".
It seems to have trouble with things like:
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
    self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
    save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 905, in save_weakref
    pickler.save_reduce(_create_weakref, (refobj,), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 405, in save_reduce
    self.memoize(obj)
  File "/usr/lib/python2.7/pickle.py", line 244, in memoize
    assert id(obj) not in self.memo
The chunk above is, seriously, about 1% of the total traceback. Several frames point to the same line of code, so is this a circular reference problem? I'm completely new to this type of project, and I've searched all over for related questions, but none seem quite relevant enough.
Am I missing some newer libraries? Is there a better way to do this?
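To make the "better way" question concrete, here is a minimal sketch of one general technique I've seen for objects that hold weak back-references: swap the weakref for the real object in `__getstate__` and rebuild it in `__setstate__`. `Owner` and `Child` are hypothetical stand-ins I made up, not pyutilib classes, and I have not tested whether pyutilib's ports and tasks can be adapted this way. It uses only plain pickle:

```python
import pickle
import weakref


class Owner(object):
    """Hypothetical container, standing in for something like a workflow."""
    def __init__(self):
        self.children = []

    def add(self, child):
        child.set_owner(self)
        self.children.append(child)


class Child(object):
    """Hypothetical item that points back at its owner through a weakref."""
    def __init__(self, name):
        self.name = name
        self._owner = None  # will hold a weakref.ref, which plain pickle rejects

    def set_owner(self, owner):
        self._owner = weakref.ref(owner)

    def owner(self):
        return self._owner() if self._owner is not None else None

    def __getstate__(self):
        # Swap the weakref for the object it points to (or None if it is dead).
        state = self.__dict__.copy()
        state['_owner'] = self.owner()
        return state

    def __setstate__(self, state):
        # Rebuild the weakref from the restored strong reference.
        owner = state.pop('_owner')
        self.__dict__.update(state)
        self._owner = weakref.ref(owner) if owner is not None else None


owner = Owner()
owner.add(Child('a'))
restored = pickle.loads(pickle.dumps(owner, protocol=2))
```

One caveat with this sketch: it only makes sense when pickling starts from the owner. If a `Child` were pickled on its own, nothing would hold a strong reference to the restored owner after loading, so the rebuilt weakref would go dead.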
EDIT: as per Martijn Pieters' helpful comment:
Pickling is a recursive process, which is why you see certain lines repeated. The process ends up back at an object that was pickled before (id(obj) in self.memo is true only if the object was already processed).
So how can I stop this condition from being triggered? Why can't pickling automatically ignore already-serialized chunks as a base case in recursion?
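For what it's worth, plain pickle does seem to skip already-serialized objects in the ordinary case, which is what the memo is for. A minimal sketch with only built-in types (no dill or pyutilib involved; the names `shared` and `cycle` are just for illustration):

```python
import pickle

# A dict shared between two slots, inside a list that also contains itself.
shared = {'x': 1}
cycle = [shared, shared]
cycle.append(cycle)

# The memo lets pickle emit a back-reference instead of recursing forever.
restored = pickle.loads(pickle.dumps(cycle, protocol=2))
```

Here the shared dict comes back as one object and the list still contains itself. So my reading of the traceback, which may be wrong, is that the assertion is not about cycles in general: it fires in save_reduce when the object turns out to be in the memo already by the time its own reduce arguments have been saved, which is what could happen if saving the weakref's target leads back to the same weakref.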
EDIT 2: output with dill.detect.trace(True) enabled:
T4: <class 'pyutilib.workflow.workflow.Workflow'>
D2: <dict object at 0x7ffff23955c8>
T4: <class 'pyutilib.workflow.port.InputPorts'>
T4: <class 'pyutilib.workflow.port.Port'>
D2: <dict object at 0x7ffff20a9b40>
R1: <weakref at 0x7ffff238aaf8; to 'Workflow' at 0x7ffff2399d90>
F2: <function _create_weakref at 0x7ffff238c410>
D2: <dict object at 0x7ffff20a9e88>
D2: <dict object at 0x7ffff2395c58>
T4: <class 'argparse.ArgumentParser'>
D2: <dict object at 0x7ffff20a3050>
F2: <function _compile at 0x7ffff7ed81b8>
D2: <dict object at 0x7ffff20a3398>
T4: <class 'argparse._HelpAction'>
D2: <dict object at 0x7ffff20a36e0>
T4: <class 'argparse._ArgumentGroup'>
D2: <dict object at 0x7ffff20a3c58>
D2: <dict object at 0x7ffff20a34b0>
D2: <dict object at 0x7ffff20a3168>
D2: <dict object at 0x7ffff20a3280>
T4: <class 'argparse._StoreFalseAction'>
T4: <class 'argparse._AppendConstAction'>
T4: <class 'argparse._StoreTrueAction'>
T4: <class 'argparse._CountAction'>
T4: <class 'argparse._StoreConstAction'>
T4: <class 'argparse._VersionAction'>
T4: <class 'argparse._StoreAction'>
T4: <class 'argparse._SubParsersAction'>
T4: <class 'argparse._AppendAction'>
D2: <dict object at 0x7ffff20a35c8>
F1: <function identity at 0x7ffff23a1d70>
F2: <function _create_function at 0x7ffff2389e60>
Co: <code object identity at 0x7ffff4c118b0, file "/usr/lib/python2.7/argparse.py", line 1591>
F2: <function _unmarshal at 0x7ffff2389cf8>
D4: <dict object at 0x7ffff4c1a050>
D2: <dict object at 0x7ffff20abd70>
D2: <dict object at 0x7ffff20a3910>
T4: <class 'argparse.HelpFormatter'>
T4: <class 'pyutilib.workflow.port.OutputPorts'>
D2: <dict object at 0x7ffff20a3b40>
D2: <dict object at 0x7ffff20ab050>
D2: <dict object at 0x7ffff2395d70>
D2: <dict object at 0x7ffff20a5050>
D2: <dict object at 0x7ffff2395e88>
T4: <class 'pyutilib.workflow.task.EmptyTask'>
D2: <dict object at 0x7ffff20a7398>
D2: <dict object at 0x7ffff20ab280>
R1: <weakref at 0x7ffff238aba8; to 'EmptyTask' at 0x7ffff20a6290>
T4: <class 'pyutilib.workflow.connector.DirectConnector'>
D2: <dict object at 0x7ffff20ab4b0>
D2: <dict object at 0x7ffff2395b40>
R1: <weakref at 0x7ffff238aaa0; to 'testTask' at 0x7ffff23999d0>
T4: <class 'testworkflow.testTask'>
D2: <dict object at 0x7ffff2398d70>
D2: <dict object at 0x7ffff2395a28>
R1: <weakref at 0x7ffff238aaa0; to 'testTask' at 0x7ffff23999d0>
D2: <dict object at 0x7ffff20a9d70>
D2: <dict object at 0x7ffff20a97f8>
R1: <weakref at 0x7ffff238ab50; to 'EmptyTask' at 0x7ffff2399fd0>
D2: <dict object at 0x7ffff20a5168>
D2: <dict object at 0x7ffff20a5280>
R1: <weakref at 0x7ffff238ab50; to 'EmptyTask' at 0x7ffff2399fd0>
D2: <dict object at 0x7ffff20a55c8>
D2: <dict object at 0x7ffff20a5910>
D2: <dict object at 0x7ffff20a5c58>
D2: <dict object at 0x7ffff20a7280>
D2: <dict object at 0x7ffff20a5a28>
D2: <dict object at 0x7ffff20a56e0>
D2: <dict object at 0x7ffff20a57f8>
D2: <dict object at 0x7ffff20a5b40>
F1: <function identity at 0x7ffff23a1de8>
D4: <dict object at 0x7ffff4c1a050>
D2: <dict object at 0x7ffff20b4280>
D2: <dict object at 0x7ffff20a5e88>
D2: <dict object at 0x7ffff20a7168>
D2: <dict object at 0x7ffff20a9c58>
D2: <dict object at 0x7ffff20ab168>
D2: <dict object at 0x7ffff2395910>
D2: <dict object at 0x7ffff20a5398>
D2: <dict object at 0x7ffff20a75c8>
D2: <dict object at 0x7ffff20a54b0>
D2: <dict object at 0x7ffff20a74b0>