I'm trying to convert this pandas code to Dask to speed up a loop that will run over 40 million rows (any better tricks to make it faster are welcome!).
The pandas version works well, but the Dask one raises an error. This is my first time using Dask, so I don't yet have a feel for how to make it work.
Original pandas code:
for bv in df_2_transform.index.unique():
    # for every row where index == bv and the date is in August, write 100
    df.loc[bv and (pd.to_datetime(df['date']).dt.month == 8), v_n] = 100
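For reference, here is a minimal sketch of what the comment describes, written without the loop. It is only an illustration under stated assumptions: the toy `df`, `df_2_transform` and the column name `"value"` are hypothetical stand-ins, and it assumes the goal is "index label present in `df_2_transform` and month is August". Note that in Python `bv and (...)` does not compare the index with `bv`; when `bv` is truthy it simply evaluates to the right-hand operand, so the sketch uses `isin` and the element-wise `&` instead.

```python
import pandas as pd

# Hypothetical toy stand-ins for df, df_2_transform and v_n.
df = pd.DataFrame(
    {
        "date": ["2020-08-01", "2020-07-15", "2020-08-20", "2020-08-05"],
        "value": [1, 2, 3, 4],
    },
    index=["a", "a", "b", "c"],
)
df_2_transform = pd.DataFrame({"x": [0, 0]}, index=["a", "b"])
v_n = "value"

# Parse the dates once instead of once per loop iteration.
is_august = (pd.to_datetime(df["date"]).dt.month == 8).to_numpy()
# True for rows whose index label appears in df_2_transform.
is_wanted = df.index.isin(df_2_transform.index.unique())

# Element-wise AND (&), then a single vectorized assignment.
df.loc[is_august & is_wanted, v_n] = 100
```

Building both boolean masks once and assigning a single time avoids re-parsing the whole `date` column on every iteration of the loop.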
My Dask attempt:
for bv in df_2_transform.index.unique():
    df_receveur[v_n] = df[v_n].mask(bv and (dd.to_datetime(df['date']).dt.month == 8), 100)
where v_n is the name of a column.
I got this error message:
ValueError: Metadata inference failed in `mask`.
You have supplied a custom function and Dask is unable to
determine the type of output that that function returns.
To resolve this please provide a meta= keyword.
The docstring of the Dask function you ran should have more information.
Original error is below:
------------------------
ValueError('Must specify axis=0 or 1')
Thanks for your help.