I'm attempting to gather ID level drivers from my XGBoost classification model using LIME and I'm running into some odd errors. I'm using this link as a reference.

Here is the overall code that I'm using:

import lime.lime_tabular

explainer = lime.lime_tabular.LimeTabularExplainer(Xs_train.values, class_names=[1.0, 0.0], kernel_width=3)

predict_fn_xgb = lambda x: trained_model.predict_proba(x).astype(float)
data_point = Xs_val.values[5]

exp = explainer.explain_instance(data_point, predict_fn_xgb, num_features=10)
exp.show_in_notebook(show_all=False)

Key:

  • trained_model: trained xgboost classification model
  • class names: This is a binary classification model
  • Xs_train: the training set, with shape (73548, 84). This was used to fit trained_model.
  • Xs_val: the validation set, with shape (4910, 84). Its columns are the same as those of the training set.
  • data_point: one specific validation point

Now, when I run this code, I get the following error:

ValueError: expected res_time, email_views...training data did not have the following fields: f6, f49, f34, f21,...

I don't know where the f# column names are coming from. Seems really bizarre and I believe I'm following the example correctly.

Any help would be much appreciated. Let me know if any additional information is required.

madsthaks

1 Answer

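You haven't provided any information about the fields in your dataset, so this is partly a guess.

However, it seems you're not passing feature_names to LimeTabularExplainer. Try doing that, e.g. feature_names=Xs_train.columns. The f# names in the error are most likely the defaults XGBoost assigns to columns that arrive without names, which is what happens when a model fit on a DataFrame is later given a plain NumPy array such as Xs_val.values. Good luck.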
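
If the mismatch persists because the model was fit on a pandas DataFrame while LIME calls the prediction function with NumPy arrays, one possible workaround is to re-attach the column names inside that function. Below is a minimal sketch, not a definitive fix. make_predict_fn and StubModel are hypothetical names introduced for illustration; StubModel only imitates a model that rejects inputs whose column names differ from the ones it was trained on.

```python
import numpy as np
import pandas as pd


def make_predict_fn(model, feature_names):
    """Wrap model.predict_proba so it can be called with unnamed NumPy arrays."""
    feature_names = list(feature_names)

    def predict_fn(x):
        # Rebuild a DataFrame so the model sees the column names it was trained on.
        frame = pd.DataFrame(np.atleast_2d(x), columns=feature_names)
        return np.asarray(model.predict_proba(frame), dtype=float)

    return predict_fn


class StubModel:
    """Hypothetical stand-in for a classifier that validates column names."""

    def __init__(self, feature_names):
        self.feature_names = list(feature_names)

    def predict_proba(self, frame):
        if list(frame.columns) != self.feature_names:
            raise ValueError("feature names mismatch")
        p = np.full(len(frame), 0.5)  # constant dummy probability
        return np.column_stack([1 - p, p])


# Two column names borrowed from the error message, purely as example data.
feature_names = ["res_time", "email_views"]
predict_fn = make_predict_fn(StubModel(feature_names), feature_names)
probs = predict_fn(np.array([[1.0, 2.0], [3.0, 4.0]]))
```

In the question's setup this would correspond to passing make_predict_fn(trained_model, Xs_train.columns) as the second argument to explain_instance, in place of predict_fn_xgb. Another option, under the same assumption about the cause, is to fit the model on Xs_train.values so that both training and prediction use unnamed arrays.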

atinjanki