Good morning. I want to test HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) on a GPU, so I need to use the RAPIDS framework. I followed the steps described here: https://colab.research.google.com/drive/1rY7Ln6rEE1pOlfSHCYOVaqt8OvDO35J0#forceEdit=true&sandboxMode=true&scrollTo=EwaJSKuswsNi (taken from the RAPIDS website: https://rapids.ai/start.html), but I get the following error when I run the cuDF example code:

import cudf
import io, requests

# download CSV file from GitHub
url="https://github.com/plotly/datasets/raw/master/tips.csv"
content = requests.get(url).content.decode('utf-8')

# read CSV from memory
tips_df = cudf.read_csv(io.StringIO(content))
tips_df['tip_percentage'] = tips_df['tip']/tips_df['total_bill']*100

# display average tip by dining party size
print(tips_df.groupby('size').tip_percentage.mean())

ValueError                                Traceback (most recent call last)
<ipython-input-1-a95ca25217db> in <module>()
----> 1 import cudf
      2 import io, requests
      3 
      4 # download CSV file from GitHub
      5 url="https://github.com/plotly/datasets/raw/master/tips.csv"

2 frames
/usr/local/lib/python3.7/site-packages/cudf/_lib/__init__.py in <module>()
      2 import numpy as np
      3 
----> 4 from . import (
      5     avro,
      6     binaryop,

cudf/_lib/avro.pyx in init cudf._lib.avro()

cudf/_lib/column.pyx in init cudf._lib.column()

cudf/_lib/scalar.pyx in init cudf._lib.scalar()

cudf/_lib/interop.pyx in init cudf._lib.interop()
ValueError: pyarrow.lib.Codec size changed, may indicate binary incompatibility. Expected 48 from C header, got 40 from PyObject 

Could you please help me?

Thanks in advance.

aydi

1 Answer

Colab made some enhancements this week that affected the RAPIDS installation process. Work on a resolution is underway, and progress is being tracked in this issue (which includes a potential workaround).
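
In the meantime, it can help to check which versions actually ended up in the runtime. A "size changed, may indicate binary incompatibility" error usually means a compiled extension (here cudf) was built against a different pyarrow than the one currently installed. The snippet below is only a diagnostic sketch, not the workaround from the issue: the helper name `installed_version` is made up for illustration, and it assumes the packages were installed with metadata that Python can discover on `sys.path`.

```python
# Sketch: report installed versions without importing the packages themselves,
# since `import cudf` is the statement that fails.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:  # older interpreters such as the Python 3.7 in the traceback
    version = None


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent.

    `installed_version` is a hypothetical helper, not part of RAPIDS or Colab.
    """
    if version is not None:
        try:
            return version(package_name)
        except PackageNotFoundError:
            return None
    import pkg_resources  # fallback that ships with setuptools
    try:
        return pkg_resources.get_distribution(package_name).version
    except pkg_resources.DistributionNotFound:
        return None


# Print the versions most relevant to the error in the question.
for name in ("cudf", "pyarrow", "numpy"):
    print(name, installed_version(name))
```

If the pyarrow version printed here differs from the one your RAPIDS release expects, that mismatch is the likely cause of the import failure.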

Nick Becker