I have data that looks like this. In each column, there are value/keys of varying different lengths. Some rows are also NaN.
like match
0 [{'timestamp', 'type'}] [{'timestamp', 'type'}]
1 [{'timestamp', 'comment', 'type'}] [{'timestamp', 'type'}]
2 NaN NaN
I want to split these lists into their own columns. I want to keep all the data (and make it NaN if it is missing). I've tried following this tutorial and doing this:
df1 = pd.DataFrame(df['like'].values.tolist())
df1.columns = 'like_'+ df1.columns
df2 = pd.DataFrame(df['match'].values.tolist())
df2.columns = 'match_'+ df2.columns
col = df.columns.difference(['like','match'])
df = pd.concat([df[col], df1, df2],axis=1)
I get this error.
Traceback (most recent call last):
File "link to my file", line 12, in <module>
df1 = pd.DataFrame(df['like'].values.tolist())
File "/usr/local/lib/python3.9/site-packages/pandas/core/frame.py", line 509, in __init__
arrays, columns = to_arrays(data, columns, dtype=dtype)
File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 524, in to_arrays
return _list_to_arrays(data, columns, coerce_float=coerce_float, dtype=dtype)
File "/usr/local/lib/python3.9/site-packages/pandas/core/internals/construction.py", line 561, in _list_to_arrays
content = list(lib.to_object_array(data).T)
File "pandas/_libs/lib.pyx", line 2448, in pandas._libs.lib.to_object_array
TypeError: object of type 'float' has no len()
Can someone help me understand what I'm doing wrong?