Questions tagged [cudf]

Use this tag for questions specifically related to the cuDF library or to cuDF DataFrame manipulations.

From PyPI: The RAPIDS cuDF library is a GPU DataFrame manipulation library based on Apache Arrow that accelerates loading, filtering, and manipulation of data for model training data preparation. The RAPIDS GPU DataFrame provides a pandas-like API that will be familiar to data scientists, so they can now build GPU-accelerated workflows more easily.

146 questions
0
votes
1 answer

Correctly zipping two columns with different data types in cuDF

I have the following DataFrame in cuDF: Context Questions 0 Architecturally, the school has a Catholic cha... [To whom did the Virgin Mary allegedly…
JOKKINATOR
0
votes
1 answer

Multiply two df in GPU (cudf)

I have two DataFrames on the GPU and want to multiply them element by element. Here is a simplified version of my DataFrames: import cudf a = cudf.DataFrame() a['c1'] = [1, 2] b = cudf.DataFrame() b['c1'] = [2, 5] I want to see this output: c1 0 2 1 …
Sadcow
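For the element-wise multiplication asked about above, a minimal sketch follows. It assumes the two DataFrames share the same index and column labels, and it uses a hypothetical alias `xdf` that points at cuDF when it is installed and at pandas otherwise, so the snippet also runs on a machine without a GPU. It is only an illustration of the `*` operator, not necessarily the accepted answer.

```python
# `xdf` is a hypothetical alias: cuDF if available, otherwise pandas.
# The DataFrame calls used here exist in both libraries.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

a = xdf.DataFrame({"c1": [1, 2]})
b = xdf.DataFrame({"c1": [2, 5]})

# The * operator multiplies element by element,
# aligning the operands on index and column labels.
product = a * b
print(product)
```

If the two frames do not share labels, the alignment step introduces nulls, so it is worth checking the indexes first.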
0
votes
1 answer

std::bad_alloc: out_of_memory: CUDA error when importing data/running models

I'm trying to load a dataset into an NVIDIA RAPIDS Jupyter notebook, but this error keeps popping up when importing the dataset or when using XGBoost on a Dask DataFrame. The training dataset is 3.7 GB in size. I only have one GPU. Some specs: CPU:…
0
votes
1 answer

User Defined Function Compilation Failed When Using .apply() in CUDF

I was trying to apply a function to a cuDF DataFrame to create values for a new column using .apply(): import cudf import numpy as np import pandas as pd import sys sys.version > '3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:45:29)…
0
votes
1 answer

How to save the data drawn by cuxfilter as an image

I am considering drawing a network. For example, as shown in the demo, we can use the dashboard to get the chart, but there is no "save button" on the right side, as is often the case. cux_df = cuxfilter.DataFrame.load_graph((nodes, edges)) chart0…
felntc
0
votes
1 answer

How to pad list column in cuDF dataframe?

I want to pad each list after collecting values with a groupby operation. The conceptual implementation is like this: df = cudf.DataFrame({"g": [1, 1, 1, 2, 2, 3], "a": [1, 2, 3, 1, 3,…
bilzard
0
votes
1 answer

dpkg-deb: error: paste subprocess was killed by signal (Broken pipe) Errors were encountered while processing:

I am participating in the Kaggle OTTO Recommendation System competition and have been trying a lot of other code. I see that high-ranking participants use the cuDF library, so I want to try it, but I run into many errors. I will show you from the beginning. Please help me. E: Unmet…
0
votes
1 answer

RuntimeError: CUDA error encountered, when using cuml

When I use RAPIDS I always run into errors. Now I run: from cuml.datasets.regression import make_regression data, values = make_regression(n_samples=200000, n_features=50, n_informative=7, bias=-4.2, …
Chao Li
0
votes
2 answers

cugraph create NoneType

I tried to create a Graph from a dask_cudf DataFrame, but the Graph comes back as NoneType with no error message. I also tried the same dataset with a pandas DataFrame, and then with three sample edges. Each time I get a NoneType object. However, if…
padul
0
votes
1 answer

User defined function to combine CUDF dataframe columns

As per the title, I am trying to combine the row values from different cudf.DataFrame columns. The following code works for a standard pandas.DataFrame: import pandas as pd data = {'a': [1], 'b': [2], 'c': [3], 'd': [4]} df =…
epifanio
0
votes
0 answers

cuDF support for emoji_patterns

Is there a faster way to clean emojis from a cuDF string series? I am currently using emoji == 1.7.0 and retrieving the regex emoji patterns (since cuDF doesn't support the emoji library directly to do an emoji.get_emoji_regexp().sub("", string)…
ZooPanda
0
votes
2 answers

cudf.DataFrame.sort_values - `ValueError: Cannot convert value of type NotImplementedType to cudf scalar`

I get an error when using sort_values on cudf DataFrame (Version: 22.2.0) : >>> import cudf >>> df = cudf.DataFrame() >>> df['a'] = [0, 1, 2] >>> df['b'] = [-3, 2, 0] >>> df.sort_values('b') ValueError: Cannot convert value of type…
Mehdi
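For reference, here is a small sketch of what the `sort_values` call in the question above is expected to return. It does not reproduce or diagnose the reported `ValueError`; it only shows the intended result on the same data. The alias `xdf` is hypothetical and falls back to pandas when cuDF is not installed.

```python
# `xdf` is a hypothetical alias: cuDF if available, otherwise pandas.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

df = xdf.DataFrame({"a": [0, 1, 2], "b": [-3, 2, 0]})

# Sort rows by column "b" in ascending order (the default).
sorted_df = df.sort_values("b")
print(sorted_df)
```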
0
votes
2 answers

How to groupby with custom function in python cuDF?

I am new to using GPUs for data manipulation, and have been struggling to replicate some functions in cuDF. For instance, I want to get a mode value for each group in the dataset. In pandas it is easily done with custom functions: df =…
Dark Hobbit
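One way to get a per-group mode without a custom Python function is to count each (group, value) pair and keep the most frequent value per group. The sketch below takes that route using only built-in aggregations. The column names `g` and `v`, the sample data, and the alias `xdf` (cuDF if installed, otherwise pandas) are all assumptions for illustration, since the question's own data is truncated.

```python
# `xdf` is a hypothetical alias: cuDF if available, otherwise pandas.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

# Hypothetical sample data: group key "g" and value "v".
df = xdf.DataFrame({
    "g": ["a", "a", "a", "b", "b", "b"],
    "v": [1, 1, 2, 3, 4, 4],
})

# Count how often each value occurs within each group.
counts = df.assign(n=1).groupby(["g", "v"]).agg({"n": "sum"}).reset_index()

# Put the most frequent value first within each group
# (ties are broken by the smaller value), then keep one row per group.
counts = counts.sort_values(["g", "n", "v"], ascending=[True, False, True])
modes = counts.drop_duplicates(subset=["g"], keep="first")[["g", "v"]]
modes = modes.reset_index(drop=True)
print(modes)
```

This avoids compiling a user-defined function, at the cost of an extra groupby and sort.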
0
votes
0 answers

Rapids on colab

I have always used the following commands to install RAPIDS on Colab (from https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&offline=true&sandboxMode=true) !git clone…
paka
0
votes
1 answer

label encoding in dask_cudf dataframe

I am trying to use dask_cudf to preprocess a very large dataset (150,000,000+ records) for multi-class xgboost training and am having trouble encoding the class column (dtype is string). I tried using the 'replace' function, but the error message…
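As a rough illustration of label encoding for a string class column, the sketch below converts the column to a categorical dtype and reads its integer codes. It covers only a single, non-distributed DataFrame; the dask_cudf case from the question is not shown and may need extra steps to make the categories known across partitions. The column names, the sample data, and the alias `xdf` (cuDF if installed, otherwise pandas) are assumptions.

```python
# `xdf` is a hypothetical alias: cuDF if available, otherwise pandas.
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf

# Hypothetical class column with string labels.
labels = xdf.DataFrame({"cls": ["cat", "dog", "cat", "bird"]})

# A categorical column stores one integer code per distinct label.
labels["cls_code"] = labels["cls"].astype("category").cat.codes
print(labels)
```

Equal labels always receive the same code, which is the property a multi-class training target needs.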