I am trying to use the describe() and unstack() functions in dask
to get summary statistics of my data.
However, I get an error, as shown below.
import dask.dataframe as dd
df = dd.read_csv('Measurement_table.csv', assume_missing=True)
df.describe().compute()  # this works, but I get an error as soon as I add unstack()
What I am actually trying to do is make the pandas code below run faster with the help of dask:
(
    df.groupby(['person_id', 'measurement_concept_id', 'visit_occurrence_id'])['value_as_number']
    .describe()
    .unstack()
    .swaplevel(0, 1, axis=1)
    .reindex(df['readings'].unique(), axis=1, level=0)
)
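To make the target output concrete, here is a minimal sketch of that pandas pipeline on a made-up table. The toy values are invented, and the assumption that `readings` holds the same labels as `visit_occurrence_id` is illustrative only; the real file may differ.

```python
import pandas as pd

# Hypothetical toy data standing in for Measurement_table.csv.
df = pd.DataFrame({
    "person_id": [1, 1, 1, 1, 2, 2],
    "measurement_concept_id": [10, 10, 10, 10, 10, 10],
    "visit_occurrence_id": [100, 100, 200, 200, 100, 100],
    "value_as_number": [1.0, 3.0, 5.0, 7.0, 2.0, 4.0],
})
# Assumption: 'readings' carries the labels that end up in column level 0.
df["readings"] = df["visit_occurrence_id"]

keys = ["person_id", "measurement_concept_id", "visit_occurrence_id"]

summary = (
    df.groupby(keys)["value_as_number"]
    .describe()                  # one row per group, one column per statistic
    .unstack()                   # move visit_occurrence_id into the columns
    .swaplevel(0, 1, axis=1)     # columns become (visit, statistic)
    .reindex(df["readings"].unique(), axis=1, level=0)  # order by readings
)

print(summary)
```

The result has one row per (person_id, measurement_concept_id) pair and a two-level column index of (visit, statistic). Combinations that never occur in the data show up as NaN.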
I tried adding compute() at the end of the chain, as shown below:
df1 = (
    df.groupby(['person_id', 'measurement_concept_id', 'visit_occurrence_id'])['value_as_number']
    .describe()
    .unstack()
    .swaplevel(0, 1, axis=1)
    .reindex(df['readings'].unique(), axis=1, level=0)
    .compute()
)
I get the error below, but the same code works fine in pandas.
Can anyone help me fix this issue?