Questions tagged [rapids]

RAPIDS is a framework for accelerated machine learning and data science on GPUs.

Questions pertaining to RAPIDS. From https://rapids.ai/:

The RAPIDS suite of open source software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms for end-to-end pipeline accelerations without paying typical serialization costs. RAPIDS also includes support for multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger dataset sizes.

195 questions

1 answer

Convert cuml (RAPIDS) truncatedSVD into sklearn

I have to convert code written using cuml (RAPIDS) to sklearn. I found out that in cuml.truncatedSVD the parameter n_components, which is the number of output dimensions (number of singular values), can equal the number of inputs/features in cuml, but…

asked Jun 21 '21 at 08:28

FiReTiTi

1 answer

Gaps in nvvp timeline when running rapids with spark

I'm running a SQL query against a CSV generated with tpch-dbgen. I am running it with one thread/task for simplicity, and I see gaps in the timeline, as shown in the attached image. Are they disk operations? Can this overhead somehow be reduced or…

rapids cudf

asked Jun 20 '21 at 19:25

Eyal Hirsch

1 answer

Out of memory error with Dask and cudf loop

I am using Dask and RAPIDS to run an XGBoost model on a large (6.9 GB) dataset. The hardware is 4x 2080 Tis with 11 GB of memory each. The raw dataset has a few dozen target columns that have been one-hot encoded, so I am trying to run a loop that…

python dask rapids cudf dask-ml

asked Jun 08 '21 at 16:01

datahappy

1 answer

cuML RandomForestClassifier: CUDA error with documentation example

I am trying to run, in a Jupyter notebook, the example found here (copied below) from the RAPIDS cuML introduction on classification. It runs well with n_samples under 6000 (this parameter dictates the number of rows of the generated dataset). import…

python jupyter-notebook cuda rapids

asked Jun 03 '21 at 16:47

Oleg

1 answer

cuGraph on Multi-GPU

Recently, I have been reading the cuGraph code. I noticed that the Louvain and Katz algorithms are said to support multi-GPU. However, when I read the C++ code of Louvain, I cannot find any code related to multi-GPU. Specifically, according to a…

cuda multi-gpu rapids

asked May 27 '21 at 15:56

Sevaro

1 answer

Error while running compare models in PyCaret 2.2 in a RAPIDS 0.19 environment (conda)

I am facing this issue with RandomForestRegressor while comparing models. My PyCaret version is 2.2 and it is running in a RAPIDS 0.19 environment.

rapids pycaret

asked May 21 '21 at 20:05

samael247

2 answers

TypeError: melt() takes 1 positional argument but 2 were given

I am trying to use the melt() function, but it shows an error about passing 2 arguments, which is really weird because I am passing id as an argument and my DataFrame has only one id column. This error only comes up when I use data which…

python pandas dataframe rapids

asked May 05 '21 at 22:58

Sudhanshu

1 answer

AttributeError: 'cupy.core.core.ndarray' object has no attribute 'iloc'

I am trying to split data into training and validation sets. For this I am using train_test_split from the cuml.preprocessing.model_selection module, but got an…

python machine-learning rapids cudf

asked May 03 '21 at 14:03

Sudhanshu

1 answer

CUML fit functions throwing cp.full TypeError

I've been trying to run RAPIDS on Google Colab Pro and have successfully installed the cuml and cudf packages; however, I am unable to run even the example scripts. TL;DR: Anytime I try to run the fit function for cuml on Google Colab I get the…

python google-colaboratory rapids

asked May 03 '21 at 12:12

Glen Moutrie

1 answer

Unable to load and compute dask_cudf dataframe into blazing table and seeing some memory related errors. (cudaErrorMemoryAllocation out of memory)

Issue: I am trying to load a file (CSV and Parquet) using Dask cuDF and seeing some memory-related errors. The dataset can easily fit into memory and the file can be read correctly using BlazingSQL's read_parquet method. However the…

python memory dask dask-distributed rapids

asked Apr 29 '21 at 05:15

chaitanyac3

0 answers

CUML: Random Forest Model Can't Be Trained on a Multi GPU Dask Cluster

Based on the official distributed model training example (https://github.com/rapidsai/cuml/blob/branch-0.18/notebooks/random_forest_mnmg_demo.ipynb), I used the Iris dataset to train a random forest model on a multi-GPU Dask cluster (one scheduler…

python random-forest dask-distributed rapids

asked Apr 20 '21 at 08:28

nomad

1 answer

RAPIDS: How to use one dataframe in a UDF called with apply_rows of another dataframe?

For each row in dataframe A, I need to query DF B. I need to do something like this: filter B rows by values in column b1 (B.b1) which are in a range defined by columns A.a1 and A.a2 and assign combined values to column A.a3. In pandas that would be…

python pandas rapids cudf

asked Apr 19 '21 at 03:04

Peter

1 answer

cuDF: an alternative to Pandas Groupby + Shift?

I have a DataFrame on which I want to use Groupby + Shift. I can do this in pandas, but I cannot do it in cuDF because it is not implemented yet: see Issue #7183. The feature request was made long ago, so it seems like they will not implement this in the…

pandas rapids cudf

asked Mar 30 '21 at 02:23

Minh-Long Luu

1 answer

How to rotate X-axis labels in bokeh figure in Cuxfilter?

I have the exact same issue as this question, except the implementation is within cuxfilter (RAPIDS): cux_df = cuxfilter.DataFrame.from_dataframe(test) chart0 = cuxfilter.charts.bar('index', 'count') chart0.xaxis.major_label_orientation =…

python bokeh rapids

asked Mar 20 '21 at 05:45

lys

1 answer

Dask-Rapids data movement and out of memory issue

I am using Dask (2021.3.0) and RAPIDS (0.18) in my project. I perform preprocessing tasks on the CPU, and the preprocessed data is later transferred to the GPU for K-means clustering. But in this process, I am getting the following…

dask-distributed cupy rapids dask-ml

asked Mar 19 '21 at 08:29

Vivek kala
