I'm using TinyDB for a small CLI utility to manage personal document drafts. The database stores metadata for each draft; the file should be human-editable (so that I can add details manually), and for this reason I'd like to use YAML over JSON as the format.
I implemented a YamlStorage class subclassing storages.Storage, as indicated in the TinyDB docs:
class TestYamlStorage(Storage):
    """
    Store the data in a YAML file.
    Written following the example at http://tinydb.readthedocs.io/en/latest/extend.html#write-a-custom-storage
    """
    def __init__(self, filename): # (1)
        super().__init__()
        self.filename = filename
        touch(filename)

    def read(self):
        with open(self.filename) as handle:
            try:
                data = yaml.load(handle.read())
                return data
            except yaml.YAMLError:
                return None # (3)

    def write(self, data):
        print('writing data: {}'.format(data))
        with open(self.filename, 'w') as handle:
            yaml.dump(data, handle)

    def close(self): # (4)
        pass
Everything works fine when inserting only one element, or multiple elements at the same time using insert_multiple:
db = TinyDB('db.yaml', storage=TestYamlStorage)
dicts = [
    dict(name='Homer', age=38),
    dict(name='Marge', age=34),
    dict(name='Bart', age=10)
]
# this works as expected
db.insert_multiple(dicts)
The resulting db.yaml:
_default:
  1: {age: 38, name: Homer}
  2: {age: 34, name: Marge}
  3: {age: 10, name: Bart}
However, when inserting elements multiple times with insert, the resulting YAML file is different:
db = TinyDB('db.yaml', storage=TestYamlStorage)
db.insert(dict(name='Homer', age=38))
db.insert(dict(name='Bart', age=10))
db.yaml:
_default:
  1: !!python/object/new:tinydb.database.Element
    dictitems: {age: 38, name: Homer}
    state: {eid: 1}
  2: {age: 10, name: Bart}
The data in this format (apart from looking messier) does not seem to be compatible with yaml.safe_load (calling db.all() returns []). My interpretation is that the YAML serialization process is in some way "over-eager", i.e. that the Element instance itself gets written to db.yaml instead of the underlying data.
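To make that interpretation concrete, here is a minimal sketch using only the standard library. The Element class below is a hypothetical stand-in, not TinyDB's actual code: I am assuming, based on the dictitems and state: {eid: 1} entries in the dump above, that the real Element is a dict subclass carrying an eid attribute. The to_plain helper is likewise my own illustrative name; it copies nested data into built-in dict and list objects so that no subclass instance remains.

```python
class Element(dict):
    """Hypothetical stand-in for tinydb.database.Element (assumed: dict subclass with an eid)."""

    def __init__(self, value=None, eid=None):
        super().__init__(value or {})
        self.eid = eid


def to_plain(data):
    """Recursively copy nested dicts and lists into built-in dict and list objects."""
    if isinstance(data, dict):
        return {key: to_plain(value) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return [to_plain(item) for item in data]
    return data


# Assumed shape of the data passed to write() on the second insert:
# the first record is an Element, the new one is still a plain dict.
data = {
    '_default': {
        1: Element({'name': 'Homer', 'age': 38}, eid=1),
        2: {'name': 'Bart', 'age': 10},
    }
}

plain = to_plain(data)
print(type(data['_default'][1]).__name__)   # Element
print(type(plain['_default'][1]).__name__)  # dict
```

If the interpretation is right, calling yaml.dump(to_plain(data), handle) in write() should emit plain mappings only, at the cost of dropping the eid attribute (which is already present as the key).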
Is there something wrong with my code? I've tried fiddling with PyYAML options, using a different YAML module (ruamel.yaml), and creating a second YamlStorage class copied from the default JSONStorage, but none of these made any difference.
Version info: Python 3.4.3, TinyDB 3.2.0, PyYAML 3.11. I posted a runnable MWE with all imports here.
Edit
After @Anthon's suggestion, I tried printing the YAML output to sys.stdout immediately before dumping to file. The problem also reproduces in this case. See notebook.