Questions tagged [rapids]

RAPIDS is a framework for accelerated machine learning and data science on GPUs.

Questions pertaining to RAPIDS. From https://rapids.ai/ :

The RAPIDS suite of open source software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

RAPIDS also focuses on common data preparation tasks for analytics and data science. This includes a familiar DataFrame API that integrates with a variety of machine learning algorithms for end-to-end pipeline accelerations without paying typical serialization costs. RAPIDS also includes support for multi-node, multi-GPU deployments, enabling vastly accelerated processing and training on much larger dataset sizes.

195 questions
1
vote
1 answer

conda install rapids fails due to package

I tried installing the rapids library in a conda environment but got the following error: conda install -c rapidsai -c nvidia -c conda-forge rapids=21.10.00 PackagesNotFoundError: The following packages are not available from current channels: …
1
vote
1 answer

cuxfilter.dashboard.DashBoard.preview keeps throwing "NameError: name 'launch' is not defined" at me

TLDR: await d.preview() # throws "NameError: name 'launch' is not defined" # d.preview() # does not throw "NameError: name 'launch' is not defined"; however, it still does not produce the desired image in a jupyter notebook... It's __repr__…
Tim Tyree
  • 21
  • 3
1
vote
1 answer

CuPy error: Implicit conversion to a host NumPy array via __array__ is not allowed

Getting this error while converting an array to a CuPy array: TypeError: Implicit conversion to a host NumPy array via __array__ is not allowed, To explicitly construct a GPU array, consider using cupy.asarray(...) To explicitly construct a host array,…
heisenberg_88
  • 13
  • 1
  • 4
1
vote
1 answer

Invalid order in Rapidsai cuml ARIMA

I'm trying to find right parameters for ARIMA but not able to use parameters higher than 4. Here is the code. from cuml.tsa.arima import ARIMA p = 5 q = 0 P = 1 Q = 0 model = ARIMA(train, order=(p,0,q), seasonal_order=(P,0,Q,24),…
shadow5893
  • 337
  • 4
  • 10
1
vote
3 answers

Spark RAPIDS - Operation not replaced with GPU version

I am new to RAPIDS and I have trouble understanding the supported operations. I have data in the following format: +------------+----------+ | kmer|source_seq| +------------+----------+ |TGTCGGTTTAA$| 4| |ACCACCACCAC$| …
1
vote
0 answers

Pandas TypeError when using cudf dataframe, but not pandas

I don't think I'm trying to solve this as much as understand what's going on so I can apply it in the context of my larger project. I am working on rewriting a Python package to run on GPU. Anyway, I am using cudf and cuml to pass a dataframe to a…
datahappy
  • 826
  • 2
  • 11
  • 29
1
vote
1 answer

Plotting with rapids cuGraph

I am quite a latecomer to the RAPIDS API. My question is: does the cuGraph package help in plotting graphs similar to those we make with seaborn and matplotlib, e.g. histograms and bar charts? I have searched everywhere on the internet but I can't get close to…
user322203
  • 101
  • 7
1
vote
2 answers

'cupy.core.core.ndarray' object has no attribute 'unique'

I was transforming categorical features using factorize() function which returns a tuple of a cupy array and strings. I assigned the cupy array into a variable named codes. However, I can't seem to get the unique values of codes using…
radiozz
  • 13
  • 1
  • 4
1
vote
0 answers

Can I split physical GPUs into multiple Logical/Virtual GPUS and pass them to dask_cuda.LocalCUDACluster?

I have a workflow that benefits greatly from GPU acceleration, but each task has relatively low memory requirements (2-4 GB). I'm using a combination of dask.dataframe, dask.distributed.Client, and dask_cuda.LocalCUDACluster. The process would…
1
vote
1 answer

Does JETSON NANO support RAPIDS?

Is it possible to run data science tools like RAPIDS on a JETSON NANO? After some searching, I am still not very clear... also, if it does, will data analysis run faster on it than on a CPU? Any insights will be appreciated. Thanks.
Bo Qiang
  • 739
  • 2
  • 13
  • 34
1
vote
1 answer

Why am I getting an assertion error when creating a Device Quantile Matrix?

I am using the following code to load a csv file into a dask cudf, and then creating a devicequantilematrix for xgboost which yields the error: cluster = LocalCUDACluster(rmm_pool_size=parse_bytes("9GB"), n_workers=5, threads_per_worker=1) client =…
lara_toff
  • 413
  • 2
  • 14
1
vote
1 answer

How do I install dask_cudf?

I am using the following lines in the terminal to install rapids and then dask cudf: conda create -n rapids-core-0.14 -c rapidsai -c nvidia -c conda-forge \ -c defaults rapids=0.14 python=3.7 cudatoolkit=10.1 conda activate rapids-core-0.14 conda…
lara_toff
  • 413
  • 2
  • 14
1
vote
1 answer

How to access Spark DataFrame data in GPU from ML Libraries such as PyTorch or Tensorflow

Currently I am studying the usage of Apache Spark 3.0 with Rapids GPU Acceleration. In the official spark-rapids docs I came across this page which states: There are cases where you may want to get access to the raw data on the GPU, preferably…
deepNdope
  • 179
  • 3
  • 14
1
vote
1 answer

Why is cuml predict() method for KNearestNeighbors taking so long with dask_cudf DataFrame?

I have a large dataset (around 80 million rows) and I am training a KNearestNeighbors Regression model using cuml with a dask_cudf DataFrame. I am using 4 GPUs with an rmm_pool_size of 15GB each: from dask.distributed import Client from dask_cuda…
agp
  • 31
  • 6
1
vote
1 answer

Calculating haversine distances on groups using cudf and cuspatial

I am trying to use accelerated (GPU backed) computing for distance calculations, but have had a lot of trouble with the nuances between pandas and cudf. I have a df with vehicles and points in time (lat,lng,timestamp), my cpu based calculation was…
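For readers comparing GPU results against a baseline, the kind of CPU calculation this last excerpt describes can be sketched in plain Python. This is only an illustrative reference, not the asker's code and not the cudf or cuspatial API: the function names `haversine_km` and `distance_per_vehicle`, the `(vehicle_id, lat, lng, timestamp)` row layout, and the 6371 km Earth radius are all assumptions made for the example.

```python
import math
from itertools import groupby
from operator import itemgetter

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_per_vehicle(rows):
    """Sum the distances between consecutive points of each vehicle.

    rows: iterable of (vehicle_id, lat, lng, timestamp) tuples (hypothetical layout).
    Returns a dict mapping vehicle_id to total distance in kilometres.
    """
    totals = {}
    # Sort by vehicle, then by timestamp, so groupby sees each vehicle's track in order.
    ordered = sorted(rows, key=itemgetter(0, 3))
    for vehicle, points in groupby(ordered, key=itemgetter(0)):
        points = list(points)
        totals[vehicle] = sum(
            haversine_km(a[1], a[2], b[1], b[2])
            for a, b in zip(points, points[1:])
        )
    return totals
```

A per-vehicle total computed this way on a small sample can serve as a check on whatever grouped GPU version is eventually written.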