I'm trying to create a polars DataFrame from a dictionary (mainDict) in which one of the values is a list of dict objects (nestedDicts). When I do this, I get an error (see below) that I don't understand. pandas, however, does let me create a DataFrame from mainDict.
I'm not sure whether I'm doing something wrong, whether it's a bug, or whether this operation simply isn't supported by polars. I'm not too worried about finding a workaround, as it should be straightforward (suggestions are welcome), but I'd like to do it this way if possible.
I'm on polars version 0.13.38 on Google Colab (the problem also happens locally in VS Code, with Python 3.9.6 on Windows 10). Below is code that reproduces the problem, followed by its output. Thanks!
INPUT:
import polars as pl
import pandas as pd

template = {'a': ['A', 'AA'],
            'b': ['B', 'BB'],
            'c': ['C', 'CC'],
            'd': [{'D1': 'D2'}, {'DD1': 'DD2'}]}

#create a dataframe using pandas
df_pandas = pd.DataFrame(template)
print(df_pandas)

#create a dataframe using polars
df_polars = pl.DataFrame(template)
print(df_polars)
OUTPUT:
    a   b   c               d
0   A   B   C    {'D1': 'D2'}
1  AA  BB  CC  {'DD1': 'DD2'}
---------------------------------------------------------------------------
ComputeError Traceback (most recent call last)
<ipython-input-9-2abdc86d91da> in <module>()
12
13 #create a dataframe using polars
---> 14 df_polars = pl.DataFrame(template)
15 print(df_polars)
3 frames
/usr/local/lib/python3.7/dist-packages/polars/internals/frame.py in __init__(self, data, columns, orient)
300
301 elif isinstance(data, dict):
--> 302 self._df = dict_to_pydf(data, columns=columns)
303
304 elif isinstance(data, np.ndarray):
/usr/local/lib/python3.7/dist-packages/polars/internals/construction.py in dict_to_pydf(data, columns)
400 return PyDataFrame(data_series)
401 # fast path
--> 402 return PyDataFrame.read_dict(data)
403
404
/usr/local/lib/python3.7/dist-packages/polars/internals/series.py in __init__(self, name, values, dtype, strict, nan_to_null)
225 self._s = self.cast(dtype, strict=True)._s
226 elif isinstance(values, Sequence):
--> 227 self._s = sequence_to_pyseries(name, values, dtype=dtype, strict=strict)
228 elif _PANDAS_AVAILABLE and isinstance(values, (pd.Series, pd.DatetimeIndex)):
229 self._s = pandas_to_pyseries(name, values)
/usr/local/lib/python3.7/dist-packages/polars/internals/construction.py in sequence_to_pyseries(name, values, dtype, strict)
241 if constructor == PySeries.new_object:
242 try:
--> 243 return PySeries.new_from_anyvalues(name, values)
244 # raised if we cannot convert to Wrap<AnyValue>
245 except RuntimeError:
ComputeError: struct orders must remain the same
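
A possible workaround, offered as a sketch rather than a confirmed fix: the message suggests polars is trying to build a struct column from the dicts in 'd' and objects because their keys differ. Assuming that reading is right (it has not been checked against the polars source), two plain-Python options are to give every dict the same keys in the same order, or to store each dict as a JSON string. The helper names normalize_dicts and dicts_to_json are hypothetical, and neither result has been tested against polars 0.13.38.

```python
import json

template = {'a': ['A', 'AA'],
            'b': ['B', 'BB'],
            'c': ['C', 'CC'],
            'd': [{'D1': 'D2'}, {'DD1': 'DD2'}]}


def normalize_dicts(rows):
    """Give every dict the same keys in the same order, filling gaps with None.

    Hypothetical helper: assumes the error is caused by differing dict keys.
    """
    keys = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)  # keep first-seen order of keys
    return [{key: row.get(key) for key in keys} for row in rows]


def dicts_to_json(rows):
    """Serialize each dict to a JSON string so the column holds plain text."""
    return [json.dumps(row, sort_keys=True) for row in rows]


# Option 1: uniform keys in column 'd' (other columns untouched)
uniform = dict(template, d=normalize_dicts(template['d']))

# Option 2: column 'd' as JSON strings
as_text = dict(template, d=dicts_to_json(template['d']))

print(uniform['d'])
print(as_text['d'])
```

Either uniform or as_text could then be passed to pl.DataFrame in place of template. The JSON route gives up structured access to the nested values but avoids any dependence on how polars infers struct columns.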