
I am using cudf (dask-cudf) to process tens of billions of rows of social media data. I'm trying to use query to extract only the relevant users from the full data set.

However, unlike pandas, cudf's query raises an error when I pass in a list or a set.

The environment is RAPIDS 22.12 installed through Anaconda, with CUDA 11.4.

The error is as follows:


TypingError: Failed in cuda mode pipeline (step: nopython frontend)
Internal error at <numba.core.typeinfer.CallConstraint object at 0x7f381a6097f0>.
Failed in cuda mode pipeline (step: native lowering)
Failed in nopython mode pipeline (step: native lowering)
NRT required but not enabled
During: lowering "$6for_iter.1 = iternext(value=$phi6.0)" at /home/user/.pyenv/versions/anaconda3-2020.11/envs/rapids-22.12/lib/python3.8/site-packages/numba/cpython/listobj.py (664)
During: lowering "$6compare_op.2 = src in __CUDF_ENVREF__test" at <string> (2)
During: resolving callee type: type(CUDADispatcher(<function queryexpr_5ee033e5bcab9f09 at 0x7f381b909ee0>))
During: typing of call at <string> (6)

Enable logging at debug level for details.

File "<string>", line 6:
<source missing, REPL/exec in use?>

The test code is as follows.

df is a cudf.DataFrame holding an edge list with "src" and "dst" columns.

test = list(test_userid)[0:2]  # first two user IDs, as a Python list
df.query("(src==@test)or(dst==@test)")  # works only if test is a single value, not a list
df.query("src.isin(@test)")  # fails
df.query("src in @test")  # fails
df.query("src==@test")  # fails

Using query is not essential, so if there is another way to extract these rows, I would like to know about it as well.

I have confirmed that the same code extracts the rows successfully with pandas. The cudf query also works correctly when given a single value instead of a list. I would expect it to work when a list is passed to cudf as well.

felntc
  • cuDF doesn't support using nested types like List in `.query`. If you provide a [minimal, reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) and your expected output, the community may be able to suggest an alternative approach. – Nick Becker Jan 19 '23 at 19:41
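
One alternative that avoids query entirely is a boolean mask built with Series.isin, which takes a plain Python list. The sketch below is only an illustration under stated assumptions: the sample edge list, the values in test, and the helper name filter_edges are made up for the example, and the pandas fallback import is there only so the snippet can run on a machine without a GPU.

```python
try:
    import cudf as xdf  # GPU DataFrame library used in the question
except ImportError:
    import pandas as xdf  # assumption: CPU fallback so the sketch runs without a GPU

# Hypothetical edge list standing in for the real "src"/"dst" table.
df = xdf.DataFrame({"src": [1, 2, 3, 4, 5], "dst": [2, 3, 4, 5, 1]})

# Hypothetical list of user IDs to keep (plays the role of `test` in the question).
test = [1, 3]


def filter_edges(edges, user_ids):
    """Keep rows whose src or dst appears in user_ids (hypothetical helper)."""
    # isin builds a boolean column per endpoint; | combines them element-wise.
    mask = edges["src"].isin(user_ids) | edges["dst"].isin(user_ids)
    return edges[mask]


result = filter_edges(df, test)
print(result)
```

Dask DataFrames expose the same isin method and boolean indexing, so the same helper may carry over to a dask_cudf DataFrame, although that has not been checked against RAPIDS 22.12 here.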
