I'm getting this error when converting a pandas DataFrame to Parquet using pyArrow:
ArrowInvalid('Error converting from Python objects to Int64: Got Python object of type str but can only handle these types: integer
To find out which column is the problem, I built a new DataFrame in a for loop, starting with the first column and adding one more column on each iteration. The error comes from a column of dtype object that starts with 0s. I guess that's why pyArrow tries to convert the column to int, but it fails because the other values are UUIDs.
I'm trying to pass a schema (not sure if this is the way to go):
table = pa.Table.from_pandas(df, schema=schema, preserve_index=False)
where schema is df.dtypes.