When I convert a NumPy array to a pandas DataFrame, pandas changes uint64 columns to object dtype if a value is greater than 2^63 - 1:
import pandas as pd
import numpy as np
x = np.array([('foo', 2 ** 63)], dtype = np.dtype([('string', np.str_, 3), ('unsigned', np.uint64)]))
y = np.array([('foo', 2 ** 63 - 1)], dtype = np.dtype([('string', np.str_, 3), ('unsigned', np.uint64)]))
print pd.DataFrame(x).dtypes.unsigned
dtype('O')
print pd.DataFrame(y).dtypes.unsigned
dtype('uint64')
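The cutoff in my example coincides with the largest signed 64-bit integer, which makes me suspect (this is only a guess, I have not confirmed it in the pandas source) that the values pass through an int64 check somewhere. A quick sanity check of the two limits with `np.iinfo`:

```python
import numpy as np

# Largest value a signed 64-bit integer can hold
int64_max = np.iinfo(np.int64).max
# Largest value an unsigned 64-bit integer can hold
uint64_max = np.iinfo(np.uint64).max

print(int64_max == 2 ** 63 - 1)   # True
print(uint64_max == 2 ** 64 - 1)  # True

# 2 ** 63 is out of range for int64 but still fits in uint64
print(int64_max < 2 ** 63 <= uint64_max)  # True
```

So 2 ** 63 is a perfectly valid uint64 value; it is just the first one that no longer fits in int64.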
This is annoying because I can't write the DataFrame to an HDF file in the table format:
pd.DataFrame(x).to_hdf('x.hdf', 'key', format = 'table')
Output:
TypeError: Cannot serialize the column [unsigned] because its data contents are [integer] object dtype
Can someone explain why this type conversion happens?