I'm working on an app that is writing parquet files. For testing purposes, I'm trying to read a generated file with pd.read_parquet. I get a really strange error that asks for a schema:
self = <[AttributeError("'ParquetFile' object has no attribute '_schema'") raised in repr()] ParquetFile object at 0x7fae6e06b250>
This happens on the following line:
data = pd.read_parquet(file)
where file is the path to the file, relative to the content root. As far as I know, I shouldn't have to provide a schema, since Parquet files carry their own, so I'm not sure what could cause the issue. Maybe a read-permission problem?
The generated file looks good when I open it with my Parquet plugin for PyCharm:
{"Id": 12345, "Limit": 200, "Product": 818}
{"Id": 67890, "Limit": 3000, "Product": 819}
So it shouldn't be an issue with the input data.
NB: I tried the same with fastparquet directly and got the same error (which makes sense if pd.read_parquet is using it as its engine).
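One sanity check that does not depend on pandas or fastparquet is to confirm that the file carries the Parquet magic bytes: an unencrypted Parquet file starts and ends with the 4-byte marker PAR1. Below is a minimal sketch of that check; has_parquet_magic is a hypothetical helper name used for illustration, not part of any library, and passing it only shows the file is framed like Parquet, not that its footer metadata is valid.

```python
import os

# Marker that opens and closes an unencrypted Parquet file.
PARQUET_MAGIC = b"PAR1"


def has_parquet_magic(path):
    """Return True if the file starts and ends with the Parquet magic bytes.

    Hypothetical helper for illustration only; it does not parse the footer.
    """
    # Smallest possible framing: leading magic (4 bytes),
    # footer length (4 bytes), trailing magic (4 bytes).
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)  # jump to the last 4 bytes of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

If this returns False for the generated file, the file is probably empty, truncated, or not closed properly by the writer, which would point at the writing side rather than at pd.read_parquet.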