While reading the petastorm/etl/dataset_metadata.py script, I found this code:
    if row_groups_key != ".":
        for row_group in range(row_groups_per_file[row_groups_key]):
            rowgroups.append(pq.ParquetDatasetPiece(
                piece.path,
                open_file_func=dataset.fs.open,
                row_group=row_group,
                partition_keys=piece.partition_keys
            ))
where pq is imported as:

    from pyarrow import parquet as pq
I've searched everywhere for the ParquetDatasetPiece class and can't find it. Can somebody tell me where the ParquetDatasetPiece class is defined?
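One way to check whether the installed pyarrow actually exposes this name, and which module defines it, is to ask Python directly. The following is only a minimal sketch: the helper `locate_attribute` is a hypothetical name introduced here for illustration (it is not part of petastorm or pyarrow), and the result will depend on which pyarrow version is installed.

```python
import importlib


def locate_attribute(module_name, attr_name):
    """Return the name of the module that defines module_name.attr_name.

    Hypothetical helper for illustration. Returns None if the module
    cannot be imported or if it has no attribute with that name.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    obj = getattr(module, attr_name, None)
    if obj is None:
        return None
    # Classes and functions record their defining module in __module__.
    return getattr(obj, "__module__", None)


if __name__ == "__main__":
    # Prints the defining module, or None if this pyarrow build
    # does not provide the class (or pyarrow is not installed).
    print(locate_attribute("pyarrow.parquet", "ParquetDatasetPiece"))
```

If this prints None, the class is not available under that name in the installed pyarrow, which would explain why searching the installed package turns up nothing.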