A Python interface to the Parquet file format.
Questions tagged [fastparquet]
141 questions
0 votes, 1 answer
Getting a dataframe from a pandas groupby to write to parquet
I have some csv data with the following columns:
country, region, year, month, price, volume
I need to transform this to something like the following:
country, region, datapoints
Where datapoints consists of either:
(year, month, price, volume)
…

ashic
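One way to approach the reshaping described in this question is sketched below. It assumes the first variant of `datapoints` (one `(year, month, price, volume)` record per row); the second variant is cut off in the excerpt and is not guessed at here. The sample rows are made up for illustration, and only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame(
    {
        "country": ["US", "US", "US", "FR"],
        "region": ["east", "east", "west", "north"],
        "year": [2019, 2019, 2019, 2019],
        "month": [1, 2, 1, 1],
        "price": [10.0, 11.5, 9.0, 7.25],
        "volume": [100, 120, 80, 60],
    }
)

value_cols = ["year", "month", "price", "volume"]

# Iterate over the groups and collect each group's rows as a list of dicts,
# producing one output row per (country, region) pair.
rows = [
    {
        "country": country,
        "region": region,
        "datapoints": group[value_cols].to_dict("records"),
    }
    for (country, region), group in df.groupby(["country", "region"])
]
nested = pd.DataFrame(rows)

# Writing the nested column is left commented out because it needs
# fastparquet installed. Encoding the lists as JSON is an assumption
# about what the asker wants, not something stated in the question:
# from fastparquet import write
# write("nested.parq", nested, object_encoding="json")
```

The result has one row per `(country, region)` pair, with `datapoints` held as an object column of Python lists, which is why an explicit object encoding would be needed at write time.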
0 votes, 1 answer
UnicodeEncodeError When Attempting to Print Pandas DataFrame Created With Query in Python 3
I have searched and searched. I can't exactly find an issue quite like mine. I did try.
I have read Parquet data into a pandas DataFrame and used a .query statement to filter the data.
import pandas as pd
import fastparquet as fp
fieldsToInclude =…
user1998239
0 votes, 1 answer
Converting a ParquetFile to a pandas DataFrame with a column containing a set of strings in Python
I have a Parquet file with a simple schema of a few columns. I read it into Python using the code below:
from fastparquet import ParquetFile
pf = ParquetFile('inout_files.parquet')
This runs fine, but when I convert it into pandas using…

Reyhaneh
0 votes, 1 answer
Date cannot be serialized
I am getting an error while trying to save the dataframe as a file.
from fastparquet import write
write('profile_dtl.parq', df)
The error is related to "date" and the error message looks like this...
ValueError: Can't infer object conversion type:…

shantanuo
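A common cause of this kind of error is an object-dtype column holding `datetime.date` values, which has no direct Parquet type to infer. The sketch below assumes that is the situation here; the error message in the excerpt is truncated, so this is a guess at the cause rather than a confirmed diagnosis. The column name `joined` and the sample values are hypothetical.

```python
import datetime as dt

import pandas as pd

# Hypothetical frame: "joined" holds datetime.date objects, so pandas
# stores the column with object dtype rather than a datetime dtype.
df = pd.DataFrame(
    {
        "name": ["a", "b"],
        "joined": [dt.date(2019, 1, 1), dt.date(2019, 2, 15)],
    }
)

# Convert the column to a proper datetime dtype before writing.
df["joined"] = pd.to_datetime(df["joined"])

# With a datetime column the write from the question should no longer need
# to infer an object conversion (left commented out: needs fastparquet):
# from fastparquet import write
# write("profile_dtl.parq", df)
```

If the dates must stay as plain dates rather than timestamps, converting on read (for example with `.dt.date`) is one option, at the cost of going back to an object column.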
-2 votes, 2 answers
How to efficiently join multiple dask dataframes
I have 33 multi-partition dataframes. All have their metadata. They were all made with fastparquet. The structure looks something like:
- 20190101.parquet
- _common_metadata
- _metadata
- part.0.parquet
- ....
- part.n.parquet
-…

birdsarah
-4 votes, 1 answer
Divide a Parquet file into subfiles using fastparquet
I need to convert a CSV file to Parquet format. The CSV file is huge (more than 65 000 rows and 1 000 columns), so I need to divide the Parquet output into several subfiles of 5 000 rows and 200 columns each. I have already…

Maria
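The block layout described in this question (5 000 rows by 200 columns per subfile) can be sketched as follows. The helper names `block_ranges` and `write_blocks`, and the output file naming scheme, are hypothetical. What the asker has already tried is cut off in the excerpt, so this is only one possible approach.

```python
def block_ranges(n_rows, n_cols, row_block=5000, col_block=200):
    """Yield ((row_start, row_stop), (col_start, col_stop)) for each block.

    Stops are exclusive, so they can be used directly with DataFrame.iloc.
    The last block in each direction may be smaller than the block size.
    """
    for r0 in range(0, n_rows, row_block):
        for c0 in range(0, n_cols, col_block):
            yield (
                (r0, min(r0 + row_block, n_rows)),
                (c0, min(c0 + col_block, n_cols)),
            )


def write_blocks(df, prefix):
    """Write each block of a pandas DataFrame to its own Parquet file.

    Hypothetical helper: fastparquet is imported here so that block_ranges
    stays usable without it installed.
    """
    from fastparquet import write

    for (r0, r1), (c0, c1) in block_ranges(*df.shape):
        # File names encode the starting row and column of the block.
        write(f"{prefix}_r{r0}_c{c0}.parq", df.iloc[r0:r1, c0:c1])


# Example layout for a hypothetical 12 000 x 450 table.
blocks = list(block_ranges(12_000, 450))
```

For a CSV too large to load at once, the same row ranges could be combined with the `chunksize` argument of `pandas.read_csv`, so that only 5 000 rows are in memory at a time.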