I have a Dask DataFrame in the following format:
date hour device param value
20190701 21 dev_01 att_1 0.000000
20190718 22 dev_01 att_2 20.000000
20190718 22 dev_01 att_3 18.611111
20190701 21 dev_01 att_4 18.706083
20190718 22 dev_01 att_5 23.333333
I am trying to pivot the DataFrame using Dask's DataFrame.pivot_table() API. I want to use 'date', 'hour' and 'device' together as the index (i.e., in the pivoted table each row would be uniquely identified by the date, hour and device identifier):
ddf.pivot_table(index = ['date', 'hour', 'device'], columns='param', values='value')
However, it's failing with the following error:
'index' must be the name of an existing column
As I understand from the API documentation (here), the 'index' parameter accepts the name of a single column (not a list), hence this error.
Is there an alternative way to pivot a Dask DataFrame using multiple columns as the index?