Questions tagged [fastparquet]

A Python interface to the Parquet file format.

141 questions
0 votes, 1 answer

Getting a dataframe from a pandas groupby to write to parquet

I have some CSV data with the following columns: country, region, year, month, price, volume. I need to transform this to something like the following: country, region, datapoints, where datapoints consists of either: (year, month, price, volume) …
ashic
0 votes, 1 answer

UnicodeEncodeError When Attempting to Print Pandas DataFrame Created With Query in Python 3

I have searched and searched, and I can't find an issue quite like mine. I have read Parquet data into a pandas DataFrame and used a .query statement to filter the data: import pandas as pd; import fastparquet as fp; fieldsToInclude =…
user1998239
0 votes, 1 answer

Converting a ParquetFile to a pandas DataFrame with a column containing a set of strings in Python

I have a Parquet file which has a simple schema with a few columns. I read it into Python using the code below: from fastparquet import ParquetFile; pf = ParquetFile('inout_files.parquet'). This runs fine, but when I convert it into pandas using…
Reyhaneh
0 votes, 1 answer

date can not be serialized

I am getting an error while trying to save a DataFrame as a file: from fastparquet import write; write('profile_dtl.parq', df). The error is related to "date", and the error message looks like this: ValueError: Can't infer object conversion type:…
shantanuo
-2 votes, 2 answers

How to efficiently join multiple dask dataframes

I have 33 multi-partition dataframes. All have their metadata. They were all made with fastparquet. The structure looks something like: - 20190101.parquet - _common_metadata - _metadata - part.0.parquet - .... - part.n.parquet -…
birdsarah
-4 votes, 1 answer

Divide parquet file on subfiles using fastparquet

I need to convert a CSV file to Parquet format, but the file is very large (more than 65 000 rows and 1 000 columns), so I need to divide my Parquet file into several subfiles of 5 000 rows and 200 columns each. I have already…
Maria