I am trying to write a cuDF UDF that computes the Pearson autocorrelation at lag 1 of a cuDF Series.

I have defined the following UDF:

import cupy as cp
def cuda_corr(x):
    xx=x[:-1]
    yy=x[1:]
    coef=cp.corrcoef(xx,y=yy, rowvar=False)
    return coef[0,1]

I then convert a pandas column to a cuDF Series and apply the function over a rolling window:

import cudf

cdf = cudf.from_pandas(df['ex_col'])
cdf.rolling(window=3, min_periods=3, center=False).apply(cuda_corr)

This fails with the following error:

LoweringError: Failed in nopython mode pipeline (step: nopython mode backend)
Unknown attribute 'corrcoef' of type Module(<module 'cupy' from '/home/idanre1/miniconda3/envs/rapids-21.10/lib/python3.8/site-packages/cupy/__init__.py'>)

The following pandas code works:

autocorr_window = 3
lag = 1
x = df['ex_col']
acorr = x.rolling(
    window=autocorr_window,
    min_periods=autocorr_window,
    center=False,
).apply(lambda w: w.autocorr(lag=lag), raw=False)

I am using RAPIDS 21.10 on Python 3.8.12.

  • cuDF's UDF interface relies on Numba.cuda to compile the function to run on the GPU. The supported functions do not include CuPy functions such as `corrcoef`, which is why you get an error indicating that Numba failed during lowering. cuDF has an open feature request for [Series.autocorr](https://github.com/rapidsai/cudf/issues/9635), but this won't provide you with a rolling version. There isn't a simple and efficient way to do rolling window autocorrelation with cuDF at the moment. You'd need to write a Numba.cuda kernel to calculate the correlation and use that with `rolling.apply` – Nick Becker Nov 10 '21 at 13:57
  • I guess the CuPy `corrcoef` source code could serve as a reference to start coding this on my own? Maybe there is even a kernel I can borrow? – Idan Regev Nov 11 '21 at 06:08
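
Following up on the comments above, here is a minimal sketch of a lag-1 Pearson autocorrelation written with only plain loops, arithmetic, and `math.sqrt`, i.e. without any CuPy or NumPy calls that Numba would have to lower. The function name `autocorr_lag1` is made up for illustration. It is an assumption, not something verified here on RAPIDS 21.10, that cuDF's `rolling.apply` can compile `len()`, indexing, and `math.nan` inside the window function; the block below only checks the arithmetic on the CPU with plain Python lists.

```python
import math


def autocorr_lag1(window):
    # Pair window[i] with window[i + 1]; this mirrors xx = x[:-1], yy = x[1:].
    n = len(window) - 1
    if n < 2:
        # Fewer than two pairs: the correlation is undefined.
        return math.nan

    # First pass: means of the leading and the lagged values.
    sum_x = 0.0
    sum_y = 0.0
    for i in range(n):
        sum_x += window[i]
        sum_y += window[i + 1]
    mean_x = sum_x / n
    mean_y = sum_y / n

    # Second pass: covariance and variances (unnormalised, the 1/n cancels).
    cov = 0.0
    var_x = 0.0
    var_y = 0.0
    for i in range(n):
        dx = window[i] - mean_x
        dy = window[i + 1] - mean_y
        cov += dx * dy
        var_x += dx * dx
        var_y += dy * dy

    denom = math.sqrt(var_x * var_y)
    if denom == 0.0:
        # A constant window has zero variance, so return NaN like pandas does.
        return math.nan
    return cov / denom


# CPU sanity check on plain lists (no GPU needed).
print(autocorr_lag1([1.0, 2.0, 3.0, 4.0]))
print(autocorr_lag1([1.0, 3.0, 2.0, 4.0]))
```

If the assumption about Numba support holds, the function could be passed in place of `cuda_corr` in the question's call, `cdf.rolling(window=3, min_periods=3, center=False).apply(autocorr_lag1)`. Note that with `window=3` there are only two pairs per window, so every result is +1, -1, or NaN; a larger window is needed for the value to be informative.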
