Questions tagged [cudf]

Use this tag for questions specifically related to the cuDF library or to cuDF DataFrame manipulations.

From PyPI: The RAPIDS cuDF library is a GPU DataFrame manipulation library based on Apache Arrow that accelerates loading, filtering, and manipulation of data for model training data preparation. The RAPIDS GPU DataFrame provides a pandas-like API that will be familiar to data scientists, so they can now build GPU-accelerated workflows more easily.

146 questions
0 votes, 1 answer

What to use in place of pandas.Series.filter?

pandas -> cuDF. Converting some Python written for pandas to run on RAPIDS. pandas: temp = df_train.copy(); temp['buildingqualitytypeid'] = temp['buildingqualitytypeid'].fillna(-1); temp = temp.groupby("buildingqualitytypeid").filter(lambda x:…
gumdropsteve
0 votes, 2 answers

Equivalent of pd.Series.str.slice() and pd.Series.apply() in cuDF

I want to convert the following code (which runs in pandas) to code that runs in cuDF. Sample data from .head() of the Series being manipulated is plugged into the original code in the 3rd code cell down, so you should be able to copy, paste, and run it. Original code in…
gumdropsteve
0 votes, 3 answers

Replace values in Column C where value in Column A is x

Issue: In the process of replacing null values so the column is boolean, we find null values in the fireplace_count column. If the fireplaceflag value is False, the null fireplace_count value should be replaced with 0. Written for…
gumdropsteve
0 votes, 1 answer

import cudf fails: Illegal instruction (core dumped)

The CUDA driver is installed, and cuDF was installed with conda. When I try to import cudf, it fails with Illegal instruction (core dumped). I also tried uninstalling cudf 0.7.2 and installing cudf 0.6.1, with no luck.
Michael
0 votes, 1 answer

'nvstrings' object has no attribute 'to_gpu_array'

I'm using cuML for stochastic gradient descent. I used sklearn's train_test_split to generate the splits for train_X, train_y ... from a cuDF dataframe. The following code (I removed the hyperparameters which aren't relevant to this question): from…
Sterls
-1 votes, 1 answer

Python - cuDF: find duplicates in a list of names using Levenshtein distance. ValueError when using apply function on cuDF DataFrame

I want to find duplicates (or near-duplicates) in the names of a people DataFrame using Levenshtein distance (2 names separated by a Levenshtein distance of at most 1 are considered duplicates). This requires calculating the Levenshtein distance…
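
The duplicate rule described in this excerpt (two names count as duplicates when their Levenshtein distance is at most 1) can be sketched in plain Python. This is a CPU-only illustration of the rule itself, not cuDF code and not the asker's approach; the helper names `levenshtein` and `near_duplicates` are hypothetical, and the all-pairs loop is quadratic, so it is only meant for small lists.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming, keeping one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def near_duplicates(names, max_distance=1):
    """Return every pair of names whose distance is <= max_distance."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if levenshtein(names[i], names[j]) <= max_distance:
                pairs.append((names[i], names[j]))
    return pairs


if __name__ == "__main__":
    print(near_duplicates(["anna", "anne", "bob"]))
```

Whether and how this maps onto a GPU string operation depends on the cuDF version in use, so check the installed version's documentation before porting it.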
-1 votes, 2 answers

Is there a method to run Girvan-Newman using cuGraph?

I have been using the Girvan-Newman algorithm from networkx to find the modularity of a network with 4,039 nodes and 88,234 edges. Due to the nature of the algorithm, it ran overnight and did not complete. Hence I paid for Colab Pro and I…
-1 votes, 2 answers

How to set up cuDF for CUDA 9.0

I have CUDA version 9.0 and I tried pip install cudf==0.6.1, which fails with: ERROR: Could not find a version that satisfies the requirement cudf==0.6.1 (from versions: none) ERROR: No matching distribution found for cudf==0.6.1
-1 votes, 1 answer

.data function in cuDF returning None

I am trying to perform some operations using nvstrings, but .data is returning None: import cudf; sents = cudf.read_csv("train.csv", quoting=3, skiprows=1, names=['review', 'label']); gstr = sents['review'].data; print(gstr) -> None dataset…
Md Kaish Ansari
-1 votes, 1 answer

How to determine RMM pool usage

When using an RMM pool, is it possible to query how much of the pool is occupied?
quasiben
-4 votes, 1 answer

Change pandas code into cuDF for GPU utilization

I am making pairs of images by mixing positive and negative pairs. This process is quite computationally expensive and takes a lot of RAM and processor time. To speed it up, I want to use the GPU and change the pandas code into cuDF. Now, the documentation of cuDF is very…
Khawar Islam