0

I have built a deep neural network that classifies fraudulent transactions. I am trying to use LIME to explain its predictions, but the interpretor.explain_instance() call raises an error.

The complete code is as follows:

import lime
from lime import lime_tabular

interpretor = lime_tabular.LimeTabularExplainer(
    training_data=x_train_scaled,
    feature_names=X_train.columns,
    mode='classification'
)

exp = interpretor.explain_instance(
    data_row=x_test_scaled[:1], ##new data
    predict_fn=model.predict,num_features=11
)
exp.show_in_notebook(show_table=True)

This throws the error:


---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_33/1730959582.py in <module>
      1 exp = interpretor.explain_instance(
      2     data_row=x_test_scaled[1], ##new data
----> 3     predict_fn=model.predict
      4 )
      5 

/opt/conda/lib/python3.7/site-packages/lime/lime_tabular.py in explain_instance(self, data_row, predict_fn, labels, top_labels, num_features, num_samples, distance_metric, model_regressor)
    457                     num_features,
    458                     model_regressor=model_regressor,
--> 459                     feature_selection=self.feature_selection)
    460 
    461         if self.mode == "regression":

/opt/conda/lib/python3.7/site-packages/lime/lime_base.py in explain_instance_with_data(self, neighborhood_data, neighborhood_labels, distances, label, num_features, feature_selection, model_regressor)
    180 
    181         weights = self.kernel_fn(distances)
--> 182         labels_column = neighborhood_labels[:, label]
    183         used_features = self.feature_selection(neighborhood_data,
    184                                                labels_column,

IndexError: index 1 is out of bounds for axis 1 with size 1
Gourab

2 Answers

1

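Adding labels=(0,) to the interpretor.explain_instance() call might resolve your issue. By default explain_instance() uses labels=(1,), so LIME looks up column 1 of the prediction array; the "axis 1 with size 1" in the error indicates the predictions have only a single column (column 0).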

exp = interpretor.explain_instance(
    data_row=x_test_scaled[:1], ##new data
    predict_fn=model.predict,
    num_features=11,
    labels=(0,)
)

I had a similar issue with breast cancer data trying to predict benign or malignant tumors. The column that contains the recorded sample is titled benign_0_malignant_1 with either 0 or 1 placed in each row.
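
For context, here is a minimal NumPy sketch of why the original call fails. It assumes model.predict returns a single sigmoid column of shape (n_samples, 1), which is what the "axis 1 with size 1" in the traceback suggests. The array is made-up stand-in data, not output from the real model.

```python
import numpy as np

# Stand-in for model.predict(...) on a binary classifier with one sigmoid
# output unit: one column per row (assumed shape, made-up values).
neighborhood_labels = np.array([[0.1], [0.8], [0.3]])

# labels=(0,) makes LIME select column 0, which exists.
col0 = neighborhood_labels[:, 0]

# The default labels=(1,) makes LIME select column 1, which does not exist.
try:
    neighborhood_labels[:, 1]
    error_message = None
except IndexError as err:
    error_message = str(err)

print(col0.shape)
print(error_message)
```

Note that with labels=(0,) the explanation is computed for that single output column, so it is worth checking which class that column actually represents.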

0

I think the problem is that you are passing in a 2D array, but according to the docs explain_instance() is expecting the instance as a 1D array.

Note that a single-row slice from a 2D array is itself 2D:

>>> import numpy as np
>>> arr = np.array([[1, 2], [3, 4], [5, 6]])
>>> arr[:1]
array([[1, 2]])

The other issue is the prediction function, which is expected to produce an array of probabilities, not a single prediction. Again, the docs explain (my emphasis):

classifier_fn – classifier prediction probability function, which takes a numpy array and outputs prediction probabilities. For ScikitClassifiers, this is classifier.predict_proba.

To fix these things, use a simple index into x_test_scaled instead of a slice, and pass model.predict_proba as the prediction function:

exp = interpretor.explain_instance(data_row=x_test_scaled[0],
                                   predict_fn=model.predict_proba,
                                   num_features=11
                                   )
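
If the model is a Keras-style network with a single sigmoid output, it may not expose predict_proba at all, and predict would return only P(class 1). One possible workaround, sketched here under that assumption, is to wrap predict so that it returns two columns, [P(class 0), P(class 1)]. The names fake_predict and predict_proba_2col are hypothetical, and fake_predict is only a stand-in so the snippet runs without the real model.

```python
import numpy as np

def fake_predict(x):
    """Stand-in for model.predict: returns sigmoid scores of shape (n_samples, 1)."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-x.sum(axis=1, keepdims=True)))

def predict_proba_2col(x, predict=fake_predict):
    """Turn a single-column P(class 1) output into [P(class 0), P(class 1)]."""
    p1 = np.asarray(predict(x)).reshape(-1, 1)
    return np.hstack([1.0 - p1, p1])

rows = np.zeros((4, 11))  # 4 rows, 11 features, as in the question
proba = predict_proba_2col(rows)
print(proba.shape)
```

With the real model, something like predict_fn=lambda x: predict_proba_2col(x, model.predict) could then be passed to explain_instance(), together with a 1D data_row such as x_test_scaled[0].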
Matt Hall
  • Hi @kwinkunks, I tried what you suggested and now the new error is as follows: "IndexError: index 1 is out of bounds for axis 1 with size 1". Any suggestions would be really nice. – Gourab Jun 09 '22 at 11:53
  • Not sure... what's the shape of `x_test_scaled`? Are you asking for index of 0 or 1? Main thing is to make sure you're passing in a 1D array, so check its size in the REPL before giving it to `explain_instance()`. – Matt Hall Jun 09 '22 at 16:40
  • The shape of x_test_scaled is (27960, 11). When passing the data point to explain_instance(), I used x_test_scaled[1], which has shape (11,). – Gourab Jun 10 '22 at 08:27
  • Maybe just double check, e.g. print the shape again right before you access it, or share the entire error message to show exactly what is throwing it. – Matt Hall Jun 10 '22 at 13:23
  • I rechecked and the dimensions are the same as I mentioned earlier. The complete error message is: "IndexError: index 1 is out of bounds for axis 1 with size 1" – Gourab Jun 12 '22 at 05:23
  • Show the *entire* error message by editing your post. – Matt Hall Jun 12 '22 at 14:42
  • Hi, I have added the entire error message, please have a look. Thanks – Gourab Jun 20 '22 at 06:13
  • I read the docs again and you have to pass a prediction function that produces probabilities. I edited my answer accordingly. – Matt Hall Jun 20 '22 at 11:37