I intend to store pandas DataFrames in MongoDB using the Python MongoEngine framework, coercing each DataFrame to a Python dict via df.to_dict() and storing it as a nested Document attribute. To minimize the code needed for the round trip from DataFrame to BSON and back, I'm using a custom field type called DataFrameField, defined in this gist, which coerces the DataFrame to a Python dict and back within its __set__ and __get__ methods.
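A minimal sketch of the kind of round trip described above, assuming the dict is built with `to_dict(orient="list")`. The helper names `frame_to_dict` and `dict_to_frame` are hypothetical and are not taken from the gist:

```python
import pandas as pd


def frame_to_dict(df):
    # Hypothetical helper: map each column name to a plain list of its values.
    return df.to_dict(orient="list")


def dict_to_frame(data):
    # Hypothetical helper: rebuild a DataFrame from a column -> list mapping.
    return pd.DataFrame(data)


df = pd.DataFrame({"goods": ["a", "b"], "stock": [5, 10]})
stored = frame_to_dict(df)        # dict that could be stored on a document
restored = dict_to_frame(stored)  # DataFrame rebuilt from the stored dict
```

The index is not preserved by this orientation, so a frame with a meaningful index would need it stored separately.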
This works great when setting the DataFrameField using dot notation, as in:
import pandas as pd
import numpy as np
from mongoengine import *

a_pandas_data_frame = pd.DataFrame({
    'goods': ['a', 'a', 'b', 'b', 'b'],
    'stock': [5, 10, 30, 40, 10],
    'category': ['c1', 'c2', 'c1', 'c2', 'c1'],
    'date': pd.to_datetime(['2014-01-01', '2014-02-01', '2014-01-06', '2014-02-09', '2014-03-09'])
})

class my_data(Document):
    data_frame = DataFrameField()  # defined in the referenced gist

foo = my_data()
foo.data_frame = a_pandas_data_frame
But if I pass a_pandas_data_frame to the constructor, I get:
>>> bar = my_data(data_frame = a_pandas_data_frame)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\MPGWRK-006\Anaconda2\lib\site-packages\mongoengine\base\document.py", line 116, in __init__
    setattr(self, key, value)
  File "C:\Users\MPGWRK-006\Anaconda2\lib\site-packages\mongoengine\base\document.py", line 186, in __setattr__
    super(BaseDocument, self).__setattr__(name, value)
  File "<stdin>", line 18, in __set__
ValueError: value is not a pandas.DataFrame instance
If I add a print statement such as print value to the __set__ method and call the constructor, it prints:
['category', 'date', 'goods', 'stock']
which is the list of the data frame's column names (i.e. list(a_pandas_data_frame.columns)). Is there any way to prevent the MongoEngine Document constructor from passing something other than the original object to the __set__ method?
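One way to picture how this symptom could arise, purely as an assumption about the cause: if the constructor ran a field-level conversion on the value before calling setattr, and that conversion iterated over the value, then __set__ would receive a list of column names, because iterating a DataFrame yields its column names. The sketch below uses only hypothetical stand-ins (`FakeFrame`, `FrameField`, `ToyDocument`) and no MongoEngine or pandas code, so it illustrates the pattern rather than the library's actual behaviour:

```python
class FakeFrame(object):
    """Hypothetical stand-in for a DataFrame: iterating yields column names."""

    def __init__(self, data):
        self.data = dict(data)

    def __iter__(self):
        return iter(sorted(self.data))


class FrameField(object):
    """Hypothetical descriptor that validates its value in __set__."""

    def to_python(self, value):
        # Assumed conversion step: coerce the input by iterating over it.
        return list(value)

    def __set__(self, instance, value):
        if not isinstance(value, FakeFrame):
            raise ValueError("value is not a FakeFrame instance: %r" % (value,))
        instance.__dict__["_frame"] = value

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__.get("_frame")


class ToyDocument(object):
    frame = FrameField()

    def __init__(self, **values):
        for key, value in values.items():
            field = type(self).__dict__[key]
            # Conversion runs before setattr, so __set__ never sees the original.
            setattr(self, key, field.to_python(value))


frame = FakeFrame({"goods": ["a"], "stock": [5]})

doc = ToyDocument()
doc.frame = frame  # attribute assignment: __set__ receives the frame itself

try:
    ToyDocument(frame=frame)  # constructor: __set__ receives the converted list
except ValueError as exc:
    error_message = str(exc)
```

In this toy model, the mismatch disappears if `to_python` returns the frame unchanged when it is already a `FakeFrame`, which suggests checking whether the custom field inherits a similar conversion hook.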
Thanks!
PS: I also asked this question at the [MongoEngine repo](https://github.com/MongoEngine/mongoengine/issues/1597), but there are about 300 open issues, so I don't expect a response there any time soon.