Questions tagged [rapids]

RAPIDS is a framework for accelerated machine learning and data science on GPUs.

Questions pertaining to RAPIDS. From https://rapids.ai/ :

The RAPIDS suite of open source software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms for end-to-end pipeline accelerations without paying typical serialization costs. RAPIDS also includes support for multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger dataset sizes.

195 questions
1
vote
1 answer

RuntimeError: Cluster failed to start with dask LocalCUDACluster example setup

I am new to Dask and I run into problems when executing the example code: from dask.distributed import Client from dask_cuda import LocalCUDACluster cluster = LocalCUDACluster() client = Client(cluster) I would get the following…
1
vote
0 answers

How to run query with lists and sets in cuDF

I am using cudf (dask-cudf) to handle tens to billions of rows of social media data. I'm trying to use query to extract only the relevant users from the parent data set. However, unlike pandas, cudf's query raises an error if I pass in a list or set. The…
felntc
  • 13
  • 2
1
vote
2 answers

conda error on install for RAPIDS fails due to incompatible glib

OS: Linux 4.18.0-193.28.1.el8_2.x86_64 anaconda: anaconda3/2022.10 Trying to install RAPIDS, I get: $ conda install -c rapidsai rapids Collecting package metadata (current_repodata.json): done Solving environment: failed with initial frozen…
Mark Bower
  • 569
  • 2
  • 16
1
vote
1 answer

'Cannot convert value of type NotImplementedType to cudf scalar' appearing on trivial sort_values example in cudf 22.08, python 3.9

Apologies - I'm aware there's a similar question, but I'm new to SO, so I'm unable to comment underneath the answer. I'm having issues with sort_values in a vanilla install of cudf as per the RAPIDS website: conda create -n rapids-22.08 -c…
dcgt1
  • 33
  • 3
1
vote
1 answer

CUDF not reading columns properly

I'm trying to read a CSV with cudf. It works nicely, but when I try to get the content of the columns, cudf does not seem to recognize them at all. It's very odd behavior. Here is the code: And here is the error: Any help, please? Thanks.
1
vote
1 answer

Dask-cuDF to CuDF dataframe conversion

Is there any function that converts a Dask-cuDF dataframe to a cuDF dataframe? Something like from_cudf, which goes from cudf to dask-cudf: dgdf = dask_cudf.from_cudf(df, npartitions=2)
1
vote
1 answer

scala rapids using an opaque UDF for a single column dataframe that produces another column

I am trying to acquaint myself with RAPIDS Accelerator-based computation using Spark (3.3) with Scala. The primary obstacle to using the GPU appears to arise from the black-box nature of UDFs. An automatic solution would be the Scala UDF…
Quiescent
  • 1,088
  • 7
  • 18
1
vote
0 answers

Solving environment: failed with repodata from current_repodata.json occurring while installing RapidsAI on Ubuntu18.04 WSL2

I am trying to install RapidsAI on Ubuntu 18.04 in WSL2 using conda, but the installation always gets stuck after the repodata message: Solving environment: failed with repodata from current_repodata.json This was unexpected. abhipraja@abhipraja:~$ conda create -n…
abhipraja
  • 11
  • 3
1
vote
1 answer

Cannot create 3rd lagged columns with dask-cudf

I have the following dask_cudf.core.DataFrame: import pandas as pd import numpy as np import dask_cudf import cudf data = {"x":range(1,21), "nor":np.random.normal(2, 4, 20), "unif":np.random.uniform(size = 20)} df = cudf.DataFrame(data) ddf =…
Shawn Brar
  • 1,346
  • 3
  • 17
1
vote
2 answers

List operation with CUDF dataframe

I have a cuDF dataframe which looks like this: The columns POSITION_ANTENNA1 and POSITION_ANTENNA2 have a list dtype, and I want to construct a column = POSITION_ANTENNA1 - POSITION_ANTENNA2. However, it gives me an error: Lists concatenation…
Arpan Das
  • 321
  • 1
  • 3
  • 9
1
vote
1 answer

TypeError: First element of field tuple is neither a tuple nor str, with cuDF.DataFrame.apply(func,axis)

I am trying to apply a histogram row-wise using the apply function, but I am getting an error. The code below is the implementation: def f(row): return np.histogram(row, bins=5,range=(1,10)) import torch import cudf as df torch.manual_seed(1) bins =…
ammar naich
  • 73
  • 1
  • 4
1
vote
0 answers

RAPIDS cuml KNeighbors: number of landmark samples must be >= k

Minimal reproducible example: import cudf from cuml.neighbors import KNeighborsRegressor d = { 'id':['a','b','c','d','e','f'], 'latitude':[50,-22,13,37,43,14], 'longitude':[3,-43,100,27,-4,121], } df = cudf.DataFrame(d) knn =…
pjmathematician
  • 125
  • 1
  • 5
1
vote
2 answers

Join values from a DataFrame according to an array of indices

I have a DataFrame test with shape (1138812, 57). The head looks like this: And I have an array indices which has a shape (1138812, 25). It is a 2D array with each subarray having 25 indices. It looks like this: [ the indices array has 25 indices…
pjmathematician
  • 125
  • 1
  • 5
1
vote
2 answers

cudf installation issue on CentOS 7

I'm new to the RAPIDS AI libraries. I have an existing conda environment YAML file where I'm using python 3.8.5, tensorflow 2.7.0, opencv-python-headless 4.5.5.62, numpy 1.22.2, pandas 1.4.1, pandas-profiling 3.1.0, seaborn 0.11.2, matplotlib 3.5.1,…
soumeng78
  • 600
  • 7
  • 12
1
vote
0 answers

CUDA memory error calculating shap values although enough memory

I am trying to calculate SHAP Values from a previously trained Random Forest. I am getting the following error: MemoryError: std::bad_alloc: CUDA error at: /opt/anaconda3/envs/rapids-21.12/include/rmm/mr/device/cuda_memory_resource.hpp The Code I…