Here is the code:

library(mlr)
library(xgboost)
library(iml)
data("iris")
tsk = makeClassifTask(data = iris, target = "Species")
lrn = makeLearner("classif.xgboost", predict.type = "prob")
mod = mlr::train(lrn, tsk)
X = iris[which(names(iris) != "Species")]
predictor = Predictor$new(mod, data = X, y = iris$Species)
imp = FeatureImp$new(predictor, loss = "ce")

I got the following error:

imp = FeatureImp$new(predictor, loss = "ce")
Warning in predict.WrappedModel(model, newdata = newdata) :
  Provided data for prediction is not a pure data.frame but from class data.table, 
  hence it will be converted.

Error in estimate.feature.imp(feature, data.sample = data.sample, y = y, : task 1 failed - "Feature names stored in object and newdata are different!"

I checked the feature names in the model and in the data, and they are identical. Hence, I don't understand what exactly the error "Feature names stored in object and newdata are different!" means.

colnames(X)
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 

mod$learner.model$feature_names    
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width" 
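
Comparing the two printouts by eye can miss a difference in order or a stray name, so a programmatic check may be more reliable. The following is a minimal sketch in base R; `data_names` and `model_names` are hypothetical stand-ins for `colnames(X)` and `mod$learner.model$feature_names`, filled in here with the values printed above.

```r
# Hypothetical stand-ins for colnames(X) and mod$learner.model$feature_names,
# using the values printed above.
data_names  <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
model_names <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# TRUE only if the names AND their order match exactly.
same_order <- identical(data_names, model_names)

# TRUE if the same names are present, regardless of order.
same_set <- setequal(data_names, model_names)

# Names the model expects that the data does not provide (empty if none).
missing_in_data <- setdiff(model_names, data_names)
```

If `same_set` is TRUE but `same_order` is FALSE, the names agree and only the column order differs.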
phiver
user19568

1 Answer

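This is an xgboost issue: https://github.com/dmlc/xgboost/issues/1809

It's about the order of the variables: the columns passed at prediction time have to be in the same order as the feature names stored in the model.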

X = X[mod$learner.model$feature_names]

should solve it. I faced the same issue some days ago.
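
As a rough illustration of what that reordering does, here is a sketch in base R only. `feature_names` is a hypothetical stand-in for `mod$learner.model$feature_names`, and `X_shuffled` is a deliberately scrambled copy of the data; neither comes from the original code.

```r
data("iris")
X <- iris[names(iris) != "Species"]

# Hypothetical stand-in for mod$learner.model$feature_names.
feature_names <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Simulate data whose columns arrive in a different order.
X_shuffled <- X[c(3, 1, 4, 2)]

# Same names, but not in the order the model stores them.
order_matches_before <- identical(names(X_shuffled), feature_names)

# Indexing by the stored names puts the columns back in the model's order.
X_fixed <- X_shuffled[feature_names]
order_matches_after <- identical(names(X_fixed), feature_names)
```

Indexing a data frame by a character vector selects columns by name, so the values stay attached to the right columns and only their order changes.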

Edit: The error still occurs with iml, probably because the columns get out of order again when iml shuffles the features. Reordering the columns is still the way to go in general, as the same error also occurs for plain predict calls using xgboost.

pat-s