Questions tagged [cudf]

Use this tag for questions specifically related to the cuDF library or to cuDF DataFrame manipulations.

From PyPI: The RAPIDS cuDF library is a GPU DataFrame manipulation library based on Apache Arrow that accelerates loading, filtering, and manipulation of data for model training data preparation. The RAPIDS GPU DataFrame provides a pandas-like API that will be familiar to data scientists, so they can now build GPU-accelerated workflows more easily.

146 questions
0
votes
1 answer

TypingError in rapids cudf User Defined Function

I have a cudf df with Close and Date columns, where Close is float64 and Date is (%Y-%m-%d) datetime64. I wanted to define a function that takes those columns as inputs and creates what is known as Market Profile, as Data is granular, in same Date…
jack
0
votes
1 answer

error installing cuDF using conda with py3.9

I just created a new env using miniconda with py3.9 and cuda. While trying to install cudf with: conda install -c rapidsai cudf I get the following error message. Output in format: Requested package -> Available versions The following specifications…
ManOnTheMoon
0
votes
0 answers

'sub' operator not supported Dask_cudf

I came here because of a question that came up while following the tutorial's methodology at https://docs.rapids.ai/api/cudf/nightly/user_guide/10min.html. I have a dataframe imported as csv with the following structure: x_tick.head() LocalTime Ask…
jack
0
votes
1 answer

Apply ta_py function to Cudf dataframe - RAPIDS

trying to create a new column on a cudf dataframe based on VWMA from ta_py : #creating df CJ_m30 = cudf.read_csv("/media/f333a/Data/CJ_m30.csv", names = ["DateTime","Bid","Ask","Open", "High", "Low", "Close"]) #trying to…
zack
0
votes
1 answer

GPU runs out of memory when training an ML model

I am trying to train an ML model using dask. I am training on my local machine with 1 GPU. My GPU has 24 GiB of memory. from dask_cuda import LocalCUDACluster from dask.distributed import Client, LocalCluster import dask.dataframe as dd import…
Jonathon Hill
0
votes
1 answer

How to process data larger than GPU Memory using BlazingSQL

I am trying to run a SQL query on a 50 GB CSV file, but my GPU has only 40 GB of memory. How can I do the processing? Also, I am only able to run BlazingSQL with the Jupyter notebook available with their Docker image. Can anyone please help me how to…
0
votes
1 answer

Dask-cudf with single GPU

I am trying to read a 12 GB CSV file. When I read it with cuDF, it gives a memory error: MemoryError: std::bad_alloc: CUDA error at: /usr/local/envs/bsql/include/rmm/mr/device/cuda_memory_resource.hpp:69: cudaErrorMemoryAllocation out of…
0
votes
0 answers

TypeError: First element of field tuple is neither a tuple nor str

I am a beginner to RAPIDS. I am trying to run the following code on Colab. It results in an error: TypeError: First element of field tuple is neither a tuple nor str. Similar code runs well with pandas, but fails with cudf…
T. Hanuman
0
votes
1 answer

How to install cuDF on google colab with GPU Tesla K80?

I have been trying to install cuDF on Google Colab for hours. One of the requirements is that I install cuDF with a Tesla T4 GPU, but Google Colab gives me a Tesla K80 every time, so I cannot install cuDF. I tried this snippet of code to check what type…
Hamzah
0
votes
1 answer

How To Pass cuDF Dataframe to cuML.ensemble.RandomForestClassifier?

I'm trying to fit data to the cuml.ensemble.RandomForestClassifier and I keep getting the error: "The labels need to be consecutive values from 0 to the number of unique label values" I'm passing cudf.DataFrame objects into the function which have…
Jacob Dallas
0
votes
1 answer

compatibility of datetime with cudf and pandas for filter datetime in python

I want to test cudf but am stuck on a first simple task: filtering by datetime. The code works perfectly with pandas, but not with cudf. import pandas as pd #import cudf as pd import time import datetime import dateutil if __name__ == "__main__": …
dirk8b
0
votes
0 answers

How do you do a grid search with cuml without a datatype error?

I tried doing a grid search with cuml (rapids 21.10) and I get a cupy conversion error. This doesn't happen if I build the model with the same dataset without a grid search. It also works when the data is not in video memory, but it is then…
0
votes
1 answer

Why do I get a CUDA memory error when using RAPIDS in WSL?

I installed WSL 2 (5.10.60.1-microsoft-standard-WSL2) under Windows 21H2 (19044.1348) and am using NVIDIA driver 510.06 with a Pascal GPU (1070). I use the default Ubuntu version in WSL (20.04.3 LTS). I tried both the Docker and Anaconda versions. I can run…
0
votes
0 answers

cuDF rolling UDF not working with cuPY functions

I am trying to write a cuDF UDF which computes the Pearson autocorrelation with lag==1 of a cuDF series. I have defined the following UDF: import cupy as cp def cuda_corr(x): xx=x[:-1] yy=x[1:] coef=cp.corrcoef(xx,y=yy, rowvar=False) …
0
votes
1 answer

what is the most efficient way to do `diff` for a `cudf`

The rapids.ai cudf type is somewhat compatible with pandas, but here is a strange incompatibility. cudf.Series has a .diff() method, but a cudf.DataFrame does not appear to. This is super-annoying (consider, for example, a data frame of stock…
Igor Rivin