I got the error below when writing a Dask DataFrame to S3 and couldn't figure out why. Does anybody know how to fix it?
dd.from_pandas(pred, npartitions=npart).to_parquet(out_path)
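For context, here is roughly what the surrounding code looks like. The DataFrame contents and the bucket path below are only placeholders; the real pred comes out of my prediction pipeline and npart/out_path are set elsewhere:

import pandas as pd
import dask.dataframe as dd

# Placeholder stand-in for the real `pred` frame; "team_nm" is an object (string) column.
pred = pd.DataFrame({"team_nm": ["Lions", "Tigers", "Bears"], "score": [0.1, 0.9, 0.5]})
npart = 2
out_path = "s3://my-bucket/pred_parquet"  # placeholder bucket; the real target is on S3

# Same call as above: convert to a Dask DataFrame and write Parquet
# (the traceback shows the fastparquet engine being used).
dd.from_pandas(pred, npartitions=npart).to_parquet(out_path)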
The error is:
Error converting column "team_nm" to bytes using encoding UTF8. Original error: bad argument type for built-in operation
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/fastparquet/writer.py", line 175, in convert
    out = array_encode_utf8(data)
  File "fastparquet/speedups.pyx", line 60, in fastparquet.speedups.array_encode_utf8
TypeError: bad argument type for built-in operation
During handling of the above exception, another exception occurred:
I tried encoding "team_nm" as "latin-1" before writing to Parquet, but it doesn't work:
pred['team_nm'] = pred['team_nm'].str.encode("Latin-1")
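In case it is relevant, this is the kind of check I can run on the column just before the write; I'm assuming pred is still a plain pandas DataFrame at that point:

# Inspect what Python types the "team_nm" column actually holds;
# anything other than str (e.g. float NaN or None) would not encode as UTF-8.
print(pred['team_nm'].dtype)
print(pred['team_nm'].map(type).value_counts())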
I also tried upgrading fastparquet from 0.4.1 to 0.7.1, but that doesn't work either.