We have generated a parquet file with Dask (Python) and with Drill (R, using the Sergeant package). We have noticed a few issues:
- The output of Dask (i.e. `fastparquet`) includes `_metadata` and `_common_metadata` files, while the parquet file from R/Drill does not have these files and has `parquet.crc` files instead (which can be deleted). What is the difference between these parquet implementations?
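
To make the comparison concrete, here is a minimal sketch of how the two output directories could be inspected side by side. It uses only the Python standard library and does not read the parquet data itself. The function name `classify_parquet_dir` is hypothetical, and treating `.parq` as a data-file extension is an assumption; adjust both to match the actual output.

```python
from pathlib import Path


def classify_parquet_dir(directory):
    """Group the files in a parquet output directory by their apparent role.

    This only looks at file names, so it is a rough inventory, not a
    validation of the parquet contents.
    """
    groups = {"metadata": [], "checksums": [], "data": [], "other": []}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        name = path.name
        if name in ("_metadata", "_common_metadata"):
            # Sidecar metadata files, as seen in the Dask/fastparquet output.
            groups["metadata"].append(name)
        elif name.endswith(".crc"):
            # Checksum sidecar files, as seen in the Drill output.
            groups["checksums"].append(name)
        elif name.endswith((".parquet", ".parq")):
            # Assumed data-file extensions; adjust if yours differ.
            groups["data"].append(name)
        else:
            groups["other"].append(name)
    return groups
```

Running it on each output directory (for example `classify_parquet_dir("dask_output")` and `classify_parquet_dir("drill_output")`, both hypothetical paths) shows which sidecar files each writer produced.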