
This question is about type evolution with jsonpickle (Python).

Current state description:

I need to store an object in a JSON file using jsonpickle in Python.

The object class CarState is generated by a script from another software component, so I can't change the class itself. The script automatically generates the __getstate__ and __setstate__ methods that jsonpickle uses to serialize the object. __getstate__ returns just a list of the values of the member variables, without the field names. Therefore jsonpickle stores only the values in the JSON data, not the field names (see the code example below).

The Problem:

Let's say my program needs to extend the class CarState with an additional field for a new version (version 2; shown below as CarStateNewVersion). If it now loads the JSON data from version 1, the values aren't assigned to the correct fields.

Here's example code demonstrating the problem. The class CarState is generated by the script and simplified here to show the problem. In version 2 I update the class CarState with a new field (in the code snippet it is added as a separate class, CarStateNewVersion, to keep things simple).

#!/usr/bin/env python
import jsonpickle as jp

# Class using slots and implementing the __getstate__ method
# Let's say this is in program version 1
class CarState(object):
    __slots__ = ['company','type']
    _slot_types = ['string','string']

    def __init__(self):
        self.company = ""
        self.type = ""

    def __getstate__(self):
        return [getattr(self, x) for x in self.__slots__]

    def __setstate__(self, state):
        for x, val in zip(self.__slots__, state):
            setattr(self, x, val)

# Class using slots and implementing the __getstate__ method
# For program version 2 a new field 'year' is needed           
class CarStateNewVersion(object):
    __slots__ = ['company','year','type']
    _slot_types = ['string','string','string']

    def __init__(self):
        self.company = ""
        self.type = ""
        self.year = "1900"

    def __getstate__(self):
        return [getattr(self, x) for x in self.__slots__]

    def __setstate__(self, state):
        for x, val in zip(self.__slots__, state):
            setattr(self, x, val)

# Class using slots without the __getstate__ method
# Let's say this is in program version 1            
class CarDict(object):
    __slots__ = ['company','type']
    _slot_types = ['string','string']

    def __init__(self):
        self.company = ""
        self.type = ""

# Class using slots without the __getstate__ method
# For program version 2 a new field 'year' is needed      
class CarDictNewVersion(object):
    __slots__ = ['company','year','type']
    _slot_types = ['string','string','string']

    def __init__(self):
        self.company = ""
        self.type = ""
        self.year = "1900"



if __name__ == "__main__":

    # Version 1 stores the data
    carDict = CarDict()
    carDict.company = "Ford"
    carDict.type = "Mustang"
    print(jp.encode(carDict))
    # {"py/object": "__main__.CarDict", "company": "Ford", "type": "Mustang"}

    # Now version 2 tries to load the data
    carDictNewVersion = jp.decode('{"py/object": "__main__.CarDictNewVersion", "company": "Ford", "type": "Mustang"}')
    # OK!
    # carDictNewVersion.company = Ford
    # carDictNewVersion.year = undefined
    # carDictNewVersion.type = Mustang


    # Version 1 stores the data
    carState = CarState()
    carState.company = "Ford"
    carState.type = "Mustang"
    print(jp.encode(carState))
    # {"py/object": "__main__.CarState", "py/state": ["Ford", "Mustang"]}

    # Now version 2 tries to load the data    
    carStateNewVersion = jp.decode('{"py/object": "__main__.CarStateNewVersion", "py/state": ["Ford", "Mustang"]}')
    # !!!! ERROR !!!!
    # carStateNewVersion.company = Ford
    # carStateNewVersion.year = Mustang
    # carStateNewVersion.type = undefined
    # With named fields, a missing 'year' can be detected and given a default
    try:
        carDictNewVersion.year
    except AttributeError:
        carDictNewVersion.year = "1900"

As the CarDict and CarDictNewVersion classes show, when __getstate__ isn't implemented the newly added field causes no problem, because the JSON text also contains the field names.

Question:

Is there a way to tell jsonpickle to ignore __getstate__ and use __dict__ instead, so that the field names are included in the JSON data? Or is there some other way to include the field names?

NOTE: I can't change the CarState class or the __getstate__ method it contains, since the class is generated by a script from another software component. I can only change the code within the main method.

Or is there another serialization tool for Python that creates human-readable output and includes field names?
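One option that only touches the calling code is to skip the pickle protocol entirely and write a name-keyed dictionary with the standard json module. The following is a minimal sketch under stated assumptions: the two classes are simplified stand-ins for the generated ones, the helper names to_named_json and from_named_json are hypothetical, and all fields are assumed to be plain JSON-compatible values (nested messages would need extra handling).

```python
import json


class CarState(object):
    """Stand-in for the generated version 1 class."""
    __slots__ = ['company', 'type']

    def __init__(self):
        self.company = ""
        self.type = ""


class CarStateNewVersion(object):
    """Stand-in for the generated version 2 class with the extra 'year' slot."""
    __slots__ = ['company', 'year', 'type']

    def __init__(self):
        self.company = ""
        self.type = ""
        self.year = "1900"


def to_named_json(obj):
    """Serialize obj as a JSON object keyed by slot name (hypothetical helper)."""
    named = {name: getattr(obj, name) for name in obj.__slots__}
    return json.dumps(named, sort_keys=True)


def from_named_json(cls, text):
    """Build cls() and copy over only the fields present in the JSON text."""
    obj = cls()  # the constructor supplies defaults for fields missing in old data
    for name, value in json.loads(text).items():
        if name in cls.__slots__:  # skip fields the target class does not have
            setattr(obj, name, value)
    return obj


old = CarState()
old.company = "Ford"
old.type = "Mustang"

text = to_named_json(old)                        # written by version 1
new = from_named_json(CarStateNewVersion, text)  # read by version 2

print(text)
print(new.company, new.year, new.type)
```

Because the fields are matched by name rather than by position, the added 'year' slot keeps its constructor default and 'type' still receives "Mustang".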


Additional background info: The class is generated from message definitions in ROS, namely by genpy, and the generated class inherits from the Message class, which implements __getstate__ (see https://github.com/ros/genpy/blob/indigo-devel/src/genpy/message.py#L308).

Stefan Profanter
  • The issue with using `__getstate__` to influence pickling is not specific to jsonpickle. Any library which respects the pickle protocol will exhibit the same behaviour. It's up to you to manage those methods for your needs. – Marcin Jan 21 '15 at 18:28

1 Answer


Subclass CarState to implement your own pickle protocol methods, or register a handler with jsonpickle.
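The subclassing route could look roughly like the sketch below. It is an illustration, not a tested jsonpickle recipe: CarState is a simplified stand-in for the generated class, NamedCarState is a hypothetical name, and the methods are called directly here so that the example runs with the standard library alone. A library that honours the pickle protocol should then receive the name-keyed dictionary instead of the positional list.

```python
class CarState(object):
    """Stand-in for the generated class; its state is a positional list."""
    __slots__ = ['company', 'type']

    def __init__(self):
        self.company = ""
        self.type = ""

    def __getstate__(self):
        return [getattr(self, x) for x in self.__slots__]

    def __setstate__(self, state):
        for x, val in zip(self.__slots__, state):
            setattr(self, x, val)


class NamedCarState(CarState):
    """Hypothetical subclass whose pickle state is keyed by field name.

    No __slots__ is declared here, so self.__slots__ still resolves to the
    list inherited from CarState.
    """

    def __getstate__(self):
        return {name: getattr(self, name) for name in self.__slots__}

    def __setstate__(self, state):
        if isinstance(state, dict):
            # Name-keyed data: assign by name, ignore unknown fields
            for name, value in state.items():
                if name in self.__slots__:
                    setattr(self, name, value)
        else:
            # Old positional data: fall back to the generated behaviour
            CarState.__setstate__(self, state)


car = NamedCarState()
car.company = "Ford"
car.type = "Mustang"
state = car.__getstate__()

# Rebuild an instance the way an unpickler would: no __init__, then __setstate__
restored = NamedCarState.__new__(NamedCarState)
restored.__setstate__(state)

# Positional data written before the change can still be read
legacy = NamedCarState.__new__(NamedCarState)
legacy.__setstate__(["Ford", "Mustang"])

print(state)
print(restored.company, restored.type)
```

The drawback is that the objects being stored have to be instances of the subclass, which means one subclass per generated type.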

Marcin
  • Since I have different object types it would be a huge amount of additional work to subclass each class (there are about 30). I think registering a handler would be the best option. But this handler should be able to handle all the 30 classes without explicitly implementing it for each type... – Stefan Profanter Jan 21 '15 at 23:31
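If all of the generated classes inherit their __getstate__ and __setstate__ from one shared base class, as the question's background note suggests for genpy's Message, a single runtime patch of that base class in the main code might cover every type at once. The sketch below uses a simplified stand-in Message class and two hypothetical message types; the function names named_getstate and named_setstate are assumptions. It is not a jsonpickle handler, and patching changes the behaviour for every other user of the pickle protocol in the same process, so treat it as an idea to evaluate rather than a recommendation.

```python
class Message(object):
    """Stand-in for the shared base class with positional state methods."""
    __slots__ = []

    def __getstate__(self):
        return [getattr(self, x) for x in self.__slots__]

    def __setstate__(self, state):
        for x, val in zip(self.__slots__, state):
            setattr(self, x, val)


class CarState(Message):
    """Hypothetical generated message type."""
    __slots__ = ['company', 'type']


class Position(Message):
    """Second hypothetical generated message type."""
    __slots__ = ['x', 'y']


def named_getstate(self):
    """Return the state as a dict keyed by slot name."""
    return {name: getattr(self, name) for name in self.__slots__}


def named_setstate(self, state):
    """Accept name-keyed dicts as well as the old positional lists."""
    if isinstance(state, dict):
        items = state.items()
    else:
        items = zip(self.__slots__, state)
    for name, value in items:
        if name in self.__slots__:
            setattr(self, name, value)


# One patch on the base class applies to every subclass
Message.__getstate__ = named_getstate
Message.__setstate__ = named_setstate

car = CarState()
car.company = "Ford"
car.type = "Mustang"

pos = Position()
pos.x = 1.0
pos.y = 2.0

car_state = car.__getstate__()
pos_state = pos.__getstate__()

restored = CarState.__new__(CarState)
restored.__setstate__(car_state)

print(car_state)
print(pos_state)
```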