My problem is:

class A(object):
    def __init__(self):
        pass  # init

    def __setstate__(self, state):
        pass  # A __setstate__ code here

    def __getstate__(self):
        # A __getstate__ code here (builds state)
        return state

class B(A):
    def __init__(self):
        pass  # creates many object variables here

A is from an external library.

Hard solution

This is what I would like to avoid.

When pickling B, pickle of course uses the __getstate__ and __setstate__ methods inherited from A, so for B's own variables to be pickled I would have to do something like this:

class B(A):
    def __init__(self):
        pass  # creates many object variables here

    def __setstate__(self, state):
        A.__setstate__(self, state)
        # B __setstate__ code here
        # getting various variables from state, for example
        self._a0 = state['a0']
        self._a1 = state['a1']
        #...
        self._a100 = state['a100']
        self._a101 = state['a101']

    def __getstate__(self):
        state = A.__getstate__(self)
        # B __getstate__ code here
        # filling state with various variables, for example
        state['a0'] = self._a0
        state['a1'] = self._a1
        #...
        state['a100'] = self._a100
        state['a101'] = self._a101
        return state

My question is:

How can I avoid defining __setstate__ and __getstate__ in B, so that pickle handles B's variables by itself? All variables in B are of types that pickle can handle without help. So if B didn't inherit from A, the following would work fine:

import pickle

b = B()
path = 'path.temp'
fout = open(path, 'wb')  # pickle data is binary
pickler = pickle.Pickler(fout)

pickler.dump(b)
fout.close()

fin = open(path, 'rb')
unpickler = pickle.Unpickler(fin)
b = unpickler.load()
fin.close()
# b has all variables

The obvious solution

class B(object):
    def __init__(self):
        # creates many object variables here
        self.a = A()  # hold an A instance instead of inheriting from A

However I would like B to inherit from A. Any idea how to solve this or at least automate pickling/unpickling variables in B?

The workaround solution:

As for automating pickling in the Hard Solution

Add to B a set holding the names of the variables to pickle:

import copy

class B(A):
    __picklableObjects__ = {'_a0', '_a1', ... ,'_a101'}

    def __init__(self):
        # creates many object variables here
        A.__init__(self)
        self._a0 = ...
        ...
        self._a101 = ...

    @staticmethod
    def getPicklableObjects():
        return B.__picklableObjects__

    def __setstate__(self, state):
        A.__setstate__(self, state)
        # restore each listed attribute on the instance
        for po in B.getPicklableObjects():
            self.__dict__[po] = state[po]

    def __getstate__(self):
        state = A.__getstate__(self)
        # add a copy of each listed attribute to A's state
        for po in B.getPicklableObjects():
            state[po] = copy.deepcopy(self.__dict__[po])
        return state

Any other ideas?

A's library:

OK, for anyone interested, A is graph_tool.Graph (A src code):

line 786: class Graph(object)

...

line 1517: __getstate__

...

line 1533: __setstate__
akaihola
user779686
  • Do you have access to the code of `A`? I usually write my `__getstate__` methods in an 'opt-out' way, where I copy the instance's `__dict__` property and delete or modify unpicklable entries. This would nicely carry over to the daughter class. – David Zwicker Dec 20 '11 at 12:20
  • Yes, I have access to A. A is in an external library. Modifying A's code would be problematic, as I am not certain how it would impact the rest of that library. So in other words, you would propose something like 'The workaround solution'? – user779686 Dec 20 '11 at 12:25
  • Have you considered using other serializers? – Karl Knechtel Dec 20 '11 at 13:25
  • I was rather suggesting modifying the code of `A` so as not to break the inheritance. I find it more convenient if a class by default returns everything from its `__getstate__` method and only excludes problematic properties. If you were able to modify `A` accordingly, you wouldn't need any code in `B` at all. – David Zwicker Dec 20 '11 at 13:29
  • I wanted to leave modifying A's code as a last resort, as A is in an external Python library. Furthermore, I would rather give up inheriting from A than modify A. – user779686 Dec 21 '11 at 09:27

2 Answers

According to the documentation, when __getstate__ isn't defined, the instance's __dict__ is pickled. So maybe you can use this to define your own state methods as a combination of A's methods and the instance's __dict__:

import pickle

class A(object):
    def __init__(self):
        self.a = 'A state'

    def __getstate__(self):
        return {'a': self.a}

    def __setstate__(self, state):
        self.a = state['a']

class B(A):
    def __init__(self):
        A.__init__(self)
        self.b = 'B state'

    def __getstate__(self):
        a_state = A.__getstate__(self)
        b_state = self.__dict__
        return (a_state, b_state)

    def __setstate__(self, state):
        a_state, b_state = state
        self.__dict__ = b_state
        A.__setstate__(self, a_state)

b = pickle.loads(pickle.dumps(B()))
print(b.a)
print(b.b)
jcollado
  • OK, but what about the fact that A defines __setstate__ and __getstate__ in order not to pickle some things? (I don't know what they are or why; the class is quite complicated.) The __getstate__ you presented duplicates variables, or at least properties, of A (they are in both a_state and b_state). I also suspect that with this solution I should check A's code thoroughly for potential blowback from pickling, in b_state, variables that A does not pickle and then setting them again, as some of them are overwritten by A.__setstate__ and some are not. – user779686 Dec 21 '11 at 09:41
  • @user779686 That's correct, part of the state in `b_state` is duplicated in `a_state`. Also, the ordering in `B.__getstate__` and `B.__setstate__` gives preference to `A` methods. However, I'd say that's appropriate to avoid breaking `A` behaviour. – jcollado Dec 21 '11 at 09:45
  • I'll experiment with this solution and see whether there are negative consequences for A, such as unpickling potentially unwanted variables or states. – user779686 Dec 21 '11 at 09:58

The default behavior of pickle, if __getstate__ is not defined, is to pickle the contents of the object's __dict__ attribute, which is where the instance attributes are stored.
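
A minimal sketch of that default, assuming a made-up class `Plain` (the class name and its attributes are hypothetical, chosen only for illustration) that defines neither state method:

```python
import pickle


class Plain(object):
    """Hypothetical class that defines neither __getstate__ nor __setstate__."""

    def __init__(self):
        self.x = 1
        self.items = ['a', 'b']


original = Plain()
restored = pickle.loads(pickle.dumps(original))

# With no custom state methods, pickle saved and restored the instance __dict__.
print(restored.__dict__ == original.__dict__)
```

The restored object is a new instance whose attributes compare equal to the original's.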

Therefore, in your case it looks like all you need to do is make A's __getstate__ and __setstate__ preserve the values found in self.__dict__ and restore them in __setstate__. This should also preserve the instance variables of all instances of subclasses of A.
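
One way this could look is sketched below. It assumes you are willing to change A, which the question wanted to avoid. The class `A` here is a stand-in, not the real graph_tool.Graph, and `_unpicklable` and `_handle` are hypothetical names introduced only for the example. The idea is the 'opt-out' style mentioned in the comments on the question: start from the whole `__dict__` and drop only the entries that cannot be pickled.

```python
import pickle


class A(object):
    """Stand-in for the library class; not the real graph_tool.Graph."""

    # Hypothetical names of attributes that must not be pickled.
    _unpicklable = ('_handle',)

    def __init__(self):
        self._handle = lambda: None  # a lambda cannot be pickled
        self.a = 'A state'

    def __getstate__(self):
        # Opt-out: copy everything in __dict__, then drop the problem entries.
        state = self.__dict__.copy()
        for name in self._unpicklable:
            state.pop(name, None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._handle = lambda: None  # rebuild what was left out


class B(A):
    def __init__(self):
        A.__init__(self)
        self.b = 'B state'  # carried along without any state methods in B


b = pickle.loads(pickle.dumps(B()))
print(b.a)
print(b.b)
```

Because the state is built from `self.__dict__`, attributes added by any subclass are included without the subclass defining its own `__getstate__` or `__setstate__`. Whether the real library class can be changed this way safely depends on what its existing state methods exclude and why.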

jsbueno