
I would like to convert a pandas DataFrame to a cuDF DataFrame on Linux.

My code:

 import cudf
 import pandas as pd

 test_data = {
     'session_id': [1, 2],
     'val': [1.1, 2.2],
 }
 pd_df = pd.DataFrame(test_data)
 pd_df.info()

 <class 'pandas.core.frame.DataFrame'>
 RangeIndex: 2 entries, 0 to 1
 Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype  
 ---  ------      --------------  -----  
 0   session_id  2 non-null      int64  
 1   val         2 non-null      float64
 dtypes: float64(1), int64(1)
 memory usage: 160.0 bytes

 cudf.DataFrame.from_pandas(pd_df) # error

 RuntimeError: cuDF failure at: /opt/rapids/src/cudf/cpp/src/interop/from_arrow.cu:86: Unsupported type_id conversion to cudf

I checked the file cudf/cpp/src/interop/from_arrow.cu, which contains these cases:

 case arrow::Type::INT64: return data_type(type_id::INT64);
 case arrow::Type::DOUBLE: return data_type(type_id::FLOAT64);

Both data types in my DataFrame (int64 and float64) appear to be supported there. Did I miss anything? Thanks.

I tried converting the pandas DataFrame to a cuDF DataFrame with cudf.DataFrame.from_pandas, but it failed even though the source code suggests it should work.

UPDATE:

Platform: Debian 4.19.269-1
Python version: 3.8.10

mtnt
  • Something about your environment may be the root cause, as this is unexpected behavior and is not reproducible in a standard environment. If you include the full environment creation and activation commands you used, we may be able to provide guidance. – Nick Becker Apr 16 '23 at 19:25
  • @NickBecker, I have added some updates. Please let me know what other information would help with this problem. Thanks. – mtnt Apr 17 '23 at 00:01
  • As this will not be reproducible for most people, I recommend including your full conda or pip-based environment and showing that you are running the code using Python from that environment. It's probably also worth creating a fresh conda/pip environment and installing the necessary packages to see if this resolves your issue. – Nick Becker Apr 17 '23 at 03:10
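
For anyone gathering the environment details requested in the comments, here is a minimal sketch that prints which Python interpreter is running and which versions of the relevant packages it sees. It uses only the standard library. The list of distribution names is an assumption: pip wheels of cuDF are typically published with a CUDA suffix, while a conda install is usually registered as plain cudf, so adjust the list to match your setup. The helper names installed_version and environment_report are made up for this example.

```python
import sys
from importlib import metadata

# Assumed distribution names; adjust to match how cuDF was installed.
# pip wheels usually carry a CUDA suffix, conda packages usually do not.
PACKAGES = ["cudf", "cudf-cu11", "cudf-cu12", "pandas", "pyarrow", "numpy"]


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=PACKAGES):
    """Collect the interpreter path, Python version, and package versions."""
    report = {
        "python_executable": sys.executable,
        "python_version": sys.version.split()[0],
    }
    for name in packages:
        report[name] = installed_version(name) or "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this with the same interpreter that raises the error, and pasting the output into the question, shows whether the pandas and pyarrow versions are the ones the installed cuDF build expects. A pyarrow version that differs from the one cuDF was built against is one possible cause worth ruling out, though nothing in the question confirms it.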
