Python cuDF cannot use cuDF dataframe function inside UDF

Question

I am trying to use cuDF's apply_rows to calculate a new column whose values are looked up from rows of another DataFrame.

For a single row, the following expression works well, where the names between %% are placeholders that I substitute with constants:

filteredhlcdf.loc[(filteredhlcdf.ddate == %%Ddate%%) & (filteredhlcdf.sstart == %%sTime%%) & (filteredhlcdf.ttime <= %%etime%%) & (filteredhlcdf.H > %%ustep%%), "ttime"].min()

However, when I try to use apply_rows as follows, it raises an error:

import cudf
import numpy as np

crossappliedcdf = cudf.from_pandas(crossapplieddf)
filteredhlcdf = cudf.from_pandas(filteredhldf)

def rowcal(ddate: int, stime: int, etime: int, usteps: float, dsteps: float, ctime: int, creturn: float):
#def rowcal(ddate: float, stime: float, etime: float, usteps: float, dsteps: float, ctime: int, creturn: float, kwarg1: int):

    for i, (tddate, tstime, tetime, tusteps, tdsteps) in enumerate(zip(ddate, stime, etime, usteps, dsteps)):
        
        ctime[i] = filteredhlcdf.loc[(filteredhlcdf.ddate == tddate) & (filteredhlcdf.sstart == tstime) & (filteredhlcdf.ttime <= tetime) & (filteredhlcdf.H > tusteps) , "ttime" ].min()
        creturn[i] =  tddate +  tstime + tetime+ tdsteps

crossappliedcdf.apply_rows(rowcal, incols=['ddate', 'stime', 'etime','usteps','dsteps'], outcols=dict(ctime=np.int32, creturn=float), kwargs={})

I am sure the error comes from the following line

ctime[i] = filteredhlcdf.loc[(filteredhlcdf.ddate == tddate) & (filteredhlcdf.sstart == tstime) & (filteredhlcdf.ttime <= tetime) & (filteredhlcdf.H > tusteps) , "ttime" ].min()

because the error disappears when that line is replaced with ctime[i] = tddate + tstime + tetime + tdsteps

Your function is being JIT compiled by Numba to run on the GPU. `numba.cuda` supports only a limited subset of Python, so you can't use DataFrame operations like `.loc` inside a Numba kernel. If you provide a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example), someone from the community may be able to assist you. cuDF's [Guide to UDFs](https://docs.rapids.ai/api/cudf/nightly/user_guide/guide-to-udfs.html) may be of interest to you. - Nick Becker, Sep 14 '21 at 21:54
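Since `.loc` cannot run inside the kernel, one possible workaround is to express the per-row lookup as DataFrame operations outside any UDF: join on the equality conditions, filter on the inequalities, then take a grouped minimum. The sketch below is written with pandas so it runs anywhere; it assumes the same `merge`, boolean-mask and `groupby(...).min()` calls behave equivalently on cuDF DataFrames. The toy data and the `row_id` helper column are hypothetical and are not taken from the question.

```python
import pandas as pd

# Hypothetical toy data that reuses the column names from the question.
crossapplied = pd.DataFrame({
    "ddate": [20210901, 20210901, 20210902],
    "stime": [900, 900, 900],
    "etime": [1000, 930, 1100],
    "usteps": [1.0, 5.0, 0.5],
    "dsteps": [0.5, 0.5, 0.5],
})
filteredhl = pd.DataFrame({
    "ddate": [20210901, 20210901, 20210901, 20210902],
    "sstart": [900, 900, 900, 900],
    "ttime": [910, 920, 1010, 1030],
    "H": [0.5, 2.0, 9.0, 3.0],
})

# Tag each input row so the result can be joined back after filtering.
left = crossapplied.reset_index().rename(columns={"index": "row_id"})

# The equality conditions (ddate, sstart == stime) become join keys.
right = filteredhl.rename(columns={"sstart": "stime"})
pairs = left.merge(right, on=["ddate", "stime"], how="inner")

# The inequality conditions become an ordinary boolean mask.
pairs = pairs[(pairs["ttime"] <= pairs["etime"]) & (pairs["H"] > pairs["usteps"])]

# Minimum ttime per original row; rows with no match are absent here.
ctime = pairs.groupby("row_id")["ttime"].min().rename("ctime").reset_index()

# A left join keeps every input row; unmatched rows get a missing ctime.
result = left.merge(ctime, on="row_id", how="left")
result["creturn"] = (
    result["ddate"] + result["stime"] + result["etime"] + result["dsteps"]
)
print(result[["row_id", "ctime", "creturn"]])
```

Two caveats with this approach: the intermediate join can be much larger than either input when many rows share the same (ddate, stime) pair, and rows without any match end up with a missing ctime rather than an integer, so the column will not be a plain int32.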
