Take the following example data:

tag = [123,124,125]
rad = [[40.0, 33.0, 23.2], [22.3, 11.6, 20.3], [45.2, 96.2, 77.3]]

I made a pandas dataframe with tag and rad as the keys, where df['rad'][0]=[40.0, 33.0, 23.2]. I need each row value in rad to remain a list.

Now, I'm trying to append this data to an existing dataframe:

import pandas as pd

store_radial = pd.HDFStore('radials.h5')
df = pd.DataFrame({'tag': tag, 'radial': rad})
store_radial.put('tag', df, append=True, format='t', data_columns=True)

I've seen this method work in other posts here, but when I run it using my data, I get the following error:

TypeError: Cannot serialize the column [radial] because
its data contents are [mixed] object type

The types are the following:

radial    object
tag        int64
dtype: object

I believe my issue is caused by rad being a list. Is there a way to append to a .h5 file while keeping rad as a list? If there is a better way to do this, I'm all ears. Please let me know if I need to clarify.


Update: I have also tried the following and still get the same error:

df = pd.DataFrame({'radials':[rad_pro], 'tag':tag})
df['radials'].to_hdf('r%d_radials.h5' % run, key='radials', mode='a', append=True, format='table')

From what I understand, for this to work the dtype of the rad column needs to match the dtype of the individual values in each list. The error says the dtypes inside radials are mixed, and I don't understand why.
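
For what it's worth, a quick check along these lines may show where the "mixed" in the error comes from. This is only a sketch using the example data above. It assumes the table writer relies on the same kind of inference as `pd.api.types.infer_dtype`, which I have not confirmed for every pandas version.

```python
import pandas as pd

tag = [123, 124, 125]
rad = [[40.0, 33.0, 23.2], [22.3, 11.6, 20.3], [45.2, 96.2, 77.3]]
df = pd.DataFrame({"tag": tag, "radials": rad})

# A column whose cells are Python lists is stored as generic Python objects,
# so pandas reports its dtype as "object" rather than float64.
print(df.dtypes)

# Ask pandas what it thinks the object column contains. Lists are neither
# strings nor numbers, so the inferred label is not a storable scalar type.
inferred = pd.api.types.infer_dtype(df["radials"], skipna=False)
print(inferred)
```

If that reading is right, the floats inside each list are never inspected. The column is judged by its cells, and each cell is a list object.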

Clarification update: To clarify further, I want my dataframe to look like this:

In [15]: df
Out[15]: 
              radials    tag
0  [40.0, 33.0, 23.2]    123
1  [22.3, 11.6, 20.3]    124
2  [45.2, 96.2, 77.3]    125

From there, I want the appended .h5 to keep the format above, where df['radials'][i] is a list.

NoVa
  • You should not use h5 for a column of lists. Suggest you break radials into multiple columns or rows. See `pd.DataFrame.explode()` – Kyle Nov 09 '20 at 19:21
  • @Kyle, I need to keep the columns of lists as they are. I do not necessarily need to use h5 format for this, but I do need to save and append the data to a larger file. – NoVa Nov 09 '20 at 19:24
  • Don't know much about h5, but what if you convert your radials column to a string with `df['radials'].astype(str)` and then append it to your file? When you read the file in again, you convert the string column back to a list with `df['radials'].str.strip('[]').str.split(',')` – Sander van den Oord Nov 09 '20 at 21:11
  • @SandervandenOord, I tried that. The arrays are too long. – NoVa Nov 10 '20 at 02:33
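
One way to reconcile Kyle's suggestion with the requirement that `radials` stays a list might be to split the lists into numeric columns only when writing, and rebuild them after reading. The sketch below is not a confirmed fix. It assumes every list has the same length, and the helper names `lists_to_columns` and `columns_to_lists` are made up for illustration.

```python
import pandas as pd

tag = [123, 124, 125]
rad = [[40.0, 33.0, 23.2], [22.3, 11.6, 20.3], [45.2, 96.2, 77.3]]
df = pd.DataFrame({"tag": tag, "radials": rad})


def lists_to_columns(frame, col):
    """Replace a column of equal-length lists with columns col_0, col_1, ..."""
    expanded = pd.DataFrame(frame[col].tolist(), index=frame.index)
    expanded.columns = [f"{col}_{i}" for i in range(expanded.shape[1])]
    return pd.concat([frame.drop(columns=col), expanded], axis=1)


def columns_to_lists(frame, col):
    """Collapse col_0, col_1, ... back into a single column of lists."""
    # Assumes no unrelated column name starts with "<col>_".
    parts = [c for c in frame.columns if c.startswith(f"{col}_")]
    out = frame.drop(columns=parts)
    out[col] = pd.Series(frame[parts].to_numpy().tolist(), index=frame.index)
    return out


wide = lists_to_columns(df, "radials")         # numeric columns only
restored = columns_to_lists(wide, "radials")   # lists again
print(wide.dtypes)
print(restored)
```

Since `wide` holds only int64 and float64 columns, appending it with `wide.to_hdf(..., format='table', append=True)` should avoid the object-column check, provided PyTables is installed. I have not tested how this behaves with very long lists.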
