Context
I have partitioned Parquet files in S3. I want to read and concatenate them into a DataFrame so I can query and view the data in memory. This mostly works, but the data in one column, whose type is array<array<double>>, comes back as None. Other columns (str, array of int, etc.) are converted correctly. I am not sure what I am missing: either the data is lost during the conversion, or the data is there and my querying method is wrong.
Steps I have taken so far
import s3fs
import fastparquet as fp
import pandas as pd
key = 'MyAWSKey'
secret = 'MyAWSSecret'
token = 'MyAWSToken'
s3_file_system = s3fs.S3FileSystem(key=key, secret=secret, token=token)
file_names = s3_file_system.glob(path='s3://.../*.snappy.parquet')
# <class 'fastparquet.api.ParquetFile'>
fp_api_parquetfile_obj = fp.ParquetFile(file_names, open_with=s3_file_system.open)
data = fp_api_parquetfile_obj.to_pandas()
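One way to check how fastparquet sees column A before the conversion (a quick sketch, assuming the schema and dtypes attributes report the parsed file metadata; output omitted here):

# Inspect the parquet schema and the pandas dtypes fastparquet infers;
# column A should show up as a nested (list of list) type if the
# metadata is being read correctly.
print(fp_api_parquetfile_obj.schema)
print(fp_api_parquetfile_obj.dtypes)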
Query Result
# column A type is array of array of doubles
print(pd.Series(data['A']).head(10))
# Prints 10 rows of None! [Incorrect]
# column B type is array of int
print(pd.Series(data['B']).head(10))
# Prints 10 rows of array of int values correctly
# column C type is string
print(pd.Series(data['C']).head(10))
# Prints 10 rows of str values correctly
Please note that the data (array of array of doubles) exists in the files, because I can query it using Athena.
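As a cross-check (just a sketch; it assumes pyarrow is installed and accepts the s3fs filesystem object), reading the same file list through pyarrow should show whether the nested values survive a different reader:

import pyarrow.parquet as pq

# Read the same files with pyarrow instead of fastparquet; if column A
# is populated here, the values are being dropped in fastparquet's
# conversion rather than missing from the files.
table = pq.ParquetDataset(file_names, filesystem=s3_file_system).read()
data_pa = table.to_pandas()
print(data_pa['A'].head(10))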