I connected a model trained with CatBoost to my bot, but when I enter data for analysis, I get this error:

raise CatBoostError("Invalid {}[{}] = {} value: index must be < {}.".format(features_name, indx, feature, features_count))
_catboost.CatBoostError: Invalid cat_features[1] = 8 value: index must be < 8.

This is what the model code looks like:

`import pandas as pd

df = pd.read_excel(r'D:\Programming\Telegram_counter_bot\TG_data.xlsx')

df = df.head(1000)

df = df.rename(columns = {'Views (k)':'Views'})

df['Time'].replace('С','C', inplace=True)

df = df.fillna(method='ffill')



from sklearn.model_selection import train_test_split

train, test = train_test_split(df, test_size=0.2,random_state=42)
val, test = train_test_split(test, train_size=0.5,random_state=42)



one_hot_tr = pd.get_dummies(train['Time'])
one_hot_t = pd.get_dummies(test['Time'])
one_hot_v = pd.get_dummies(val['Time'])

train = pd.concat([train, one_hot_tr], axis=1)
test = pd.concat([test, one_hot_t], axis=1)
val = pd.concat([val, one_hot_v], axis=1)



X = ['Picture', 'Picture_text', 'Video', 'Header', 'Characters',
   'Text formatting', 'Emoji','A',
   'B', 'C', 'D']

cat_features = ['A','B', 'C', 'D']

y = ['Views']


from catboost import CatBoostRegressor

from catboost import Pool

train_data = Pool(data=train[X],
              label=train[y],
              cat_features=cat_features
             )

valid_data = Pool(data=val[X],
              label=val[y],
              cat_features=cat_features
             )

train_full = pd.concat([train,val])

train_full_data = Pool (train_full[X],
                    label=train_full[y],
                    cat_features=cat_features)

params = {'iterations' : 582,
      'eval_metric': 'MAE',
      'loss_function': 'MAE',
      'random_seed': 42,
      'verbose': 100,
      'learning_rate': 0.01}

model = CatBoostRegressor(**params)

model.fit(train_full_data)

y_pred = model.predict(test[X])`

And here is the function in the bot that is responsible for it:

`def process_user_data(params):

    data = pd.DataFrame([params], columns=['Picture', 'Picture_text', 'Video', 'Header', 'Characters',
                                           'Text formatting', 'Emoji', 'Time'])

    one_hot = pd.get_dummies(data['Time'])
    data = pd.concat([data, one_hot], axis=1)

    data = data.drop('Time', axis=1)

    return data`

I tried replacing A, B, C, D in cat_features with their column indices. I also tried several input formats in the bot; `/analyze 1; 0; 0; 1; 159; 1; 1; D` came closest to working.
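A possible reading of the error, offered as an assumption rather than a confirmed diagnosis: the model was trained on 11 columns with the categorical features at positions 7 to 10, while the bot's DataFrame has only 8 columns (the 7 base features plus the single dummy column that `pd.get_dummies` produces for the one entered Time value), so index 8 is out of range. The sketch below shows one way to always build all 11 columns in the training order. The names `BASE_COLUMNS`, `TIME_CATEGORIES`, `MODEL_COLUMNS`, and `process_user_data_aligned` are hypothetical, and the cast to `int` is an assumption about what the model expects for these columns.

```python
import pandas as pd

# Assumption: these lists mirror the X list used when the model was trained.
BASE_COLUMNS = ['Picture', 'Picture_text', 'Video', 'Header', 'Characters',
                'Text formatting', 'Emoji']
TIME_CATEGORIES = ['A', 'B', 'C', 'D']
MODEL_COLUMNS = BASE_COLUMNS + TIME_CATEGORIES


def process_user_data_aligned(params):
    """Hypothetical variant of process_user_data that always returns 11 columns."""
    data = pd.DataFrame([params], columns=BASE_COLUMNS + ['Time'])
    # A single row yields only one dummy column (for example just 'D').
    one_hot = pd.get_dummies(data['Time'])
    # Add the missing categories as zeros, fix their order, and use plain ints.
    one_hot = one_hot.reindex(columns=TIME_CATEGORIES, fill_value=0).astype(int)
    data = pd.concat([data.drop('Time', axis=1), one_hot], axis=1)
    # Return the columns in the same order as during training.
    return data[MODEL_COLUMNS]


row = process_user_data_aligned([1, 0, 0, 1, 159, 1, 1, 'D'])
print(row.columns.tolist())
print(row.iloc[0].tolist())
```

The same `reindex` step could be applied to the train, validation, and test dummies, since calling `pd.get_dummies` on each split separately can also yield different column sets when a category is absent from one split.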