I am trying to learn how to use the LIME library with categorical features. To do that, I am replicating a GitHub notebook on my own data set. However, I am getting an error from a function that works in the reference notebook.

Here is the function:

import catboost
import numpy as np
import pandas as pd

class catboost_predict_proba_wrapper:

    def __init__(self, cb_model, column_names, list_cat_feat_index):
        self.model = cb_model
        self.list_cat_feat_index = list_cat_feat_index
        self.column_names = column_names

    def predict_proba(self, this_array):
        # LIME may pass a single row as a 1-D array; make it 2-D.
        shape_tuple = np.shape(this_array)
        if len(shape_tuple) == 1:
            this_array = this_array.reshape(1, -1)
        self.pandas_df = pd.DataFrame(data=this_array, columns=self.column_names)
        # Mark the categorical columns using the indices stored on the instance.
        self.data_pool = catboost.Pool(self.pandas_df, cat_features=self.list_cat_feat_index)
        # Predict with the model stored in __init__.
        self.predictions = self.model.predict_proba(self.data_pool)
        return self.predictions

I have tried passing cat_features both as a list of categorical feature names and as a list of the categorical columns' index values, but I get an error either way. The error states:

CatBoostError: Invalid type for cat_feature[non-default value idx=0,feature_idx=0]=1.0 : 
cat_features must be integer or string, real number values and NaN values should be 
converted to string.

How can I resolve this?

The notebook I am referring to:

https://github.com/JNandez/catboost_lime/blob/master/kobe_lime_catboost.ipynb
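One possible reading of the error above: LIME perturbs the data as a float NumPy array, so a label-encoded categorical value reaches CatBoost as 1.0 rather than 1 or "1". The sketch below shows one way to cast those columns before building the Pool. The helper name to_catboost_frame and the sample column names are hypothetical, and it assumes the categorical columns hold whole-number codes with no NaN values. Whether to stop at int or go on to str depends on how the model was trained, which the question does not say.

```python
import numpy as np
import pandas as pd


def to_catboost_frame(this_array, column_names, cat_feat_index):
    """Build a DataFrame whose categorical columns hold strings, not floats.

    Hypothetical helper for illustration; assumes the categorical columns
    contain whole-number codes and no NaN values.
    """
    arr = np.asarray(this_array)
    if arr.ndim == 1:
        # A single row arrives as a 1-D array; make it 2-D.
        arr = arr.reshape(1, -1)
    df = pd.DataFrame(arr, columns=column_names)
    for idx in cat_feat_index:
        col = column_names[idx]
        # Float codes such as 1.0 become "1"; CatBoost accepts integers
        # or strings for categorical features, but not real numbers.
        df[col] = df[col].astype(int).astype(str)
    return df


# Made-up sample row: columns 0 and 2 are treated as categorical.
sample = np.array([1.0, 0.5, 3.0])
frame = to_catboost_frame(sample, ["team", "distance", "period"], [0, 2])
print(frame)
```

Inside predict_proba, the returned frame could then be passed to catboost.Pool together with the same list of categorical indices. If the model was trained on integer codes rather than strings, dropping the final .astype(str) may be the closer match.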

Alexander L. Hayes
Asmita