I have a bunch of JSON files, each containing a list of dicts:
[
  {"a": 5.523275199811905e-05, "b": 0.0016375015413388609, "c": -0.0166875154286623, "d": -0.06936456641533968, "e": -0.06665282790239256, "f": -0.13665749109868952, "g": 1670519207414, "h": 1670519204046, "y": ""},
  {"a": 5.523275199811905e-05, "b": 0.0016375015413388609, "c": -0.0166875154286623, "d": -0.06936456641533968, "e": -0.06665282790239256, "f": -0.13665749109868952, "g": 1670519207414, "h": 1670519204046, "y": ""}
]
Splitting the dicts up and saving each one to a separate CSV file worked fine, but I need Parquet files instead, one per dataset. Is there a way to do this without Spark?
With the code below I ended up with a partitioned Parquet dataset, but that is not actually what I need:
import pyarrow as pa
import pyarrow.parquet as pq

# df is the pandas DataFrame built from one of the JSON files
table_from_pandas = pa.Table.from_pandas(df)
pq.write_to_dataset(table_from_pandas,
                    root_path='ddp_final/handytracking.parquet',
                    partition_cols=['timestamp'])
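What I am actually after is one plain Parquet file per JSON file, using only pandas/pyarrow. A minimal sketch of the kind of conversion I have in mind (the input path is just a placeholder for one of my files) would be something like this; is this the right approach, or is there a better way?

import json
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Load one JSON file (a list of dicts) into a DataFrame
with open('ddp_final/handytracking.json') as f:
    records = json.load(f)
df = pd.DataFrame(records)

# Write a single, unpartitioned Parquet file for this dataset
pq.write_table(pa.Table.from_pandas(df), 'ddp_final/handytracking.parquet')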