I'm trying to create a recipe (the preprocessing for an XGBoost model) that uses a custom distance metric (Dice) in its UMAP step.
Here is my code:
Dice function and distance matrix
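# Dice dissimilarity between two binary (0/1) vectors
dice <- function(x, y) {
  n1 <- sum(x == 1 & y == 0)  # 1 in x only
  n2 <- sum(x == 0 & y == 1)  # 1 in y only
  n3 <- sum(x == 1 & y == 1)  # 1 in both
  (n1 + n2) / (n1 + n2 + 2 * n3)
}
# Pairwise Dice distances between the rows of X
X_dm <- proxy::dist(X, method = dice)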
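To show what the distance does, here is a small check on made-up binary vectors (`a`, `b` and `X_toy` are hypothetical, not my data). It uses base R only, so the pairwise loop is just a stand-in for `proxy::dist`, which I assume gives the same values:

```r
# Dice dissimilarity between two binary (0/1) vectors
dice <- function(x, y) {
  n1 <- sum(x == 1 & y == 0)
  n2 <- sum(x == 0 & y == 1)
  n3 <- sum(x == 1 & y == 1)
  (n1 + n2) / (n1 + n2 + 2 * n3)
}

# Made-up vectors for illustration
a <- c(1, 1, 0, 0)
b <- c(1, 0, 1, 0)

dice(a, a)      # 0: identical vectors
dice(a, b)      # 0.5: one shared 1, two mismatches
dice(a, 1 - a)  # 1: no shared 1

# Pairwise distances between the rows of a toy matrix
X_toy <- rbind(a, b, 1 - a)
n <- nrow(X_toy)
D <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- dice(X_toy[i, ], X_toy[j, ])
  }
}
X_toy_dm <- as.dist(D)  # n x n distances between rows, not an n x p feature table
```

Note that two all-zero rows give 0/0, so the function returns NaN for them.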
Workflow
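# Packages used below
library(tidymodels)  # parsnip, recipes, workflows, tune, dials, rsample, yardstick
library(themis)      # step_downsample()
library(embed)       # step_umap()
library(doParallel)  # parallel backend; also attaches parallel for makePSOCKcluster()
xgb_spec <- boost_tree(
  trees = 1000,
  tree_depth = tune(),
  min_n = tune(),
  loss_reduction = tune(),
  sample_size = tune(),
  mtry = tune(),
  learn_rate = tune()
) %>%
  set_engine("xgboost") %>%
  set_mode("classification")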
trained <- res.MCA[["call"]][["Xtot"]][indextrain, ] %>%
  data.frame() %>%
  mutate(class = Y_train)
# preprocessing + formula
umap_rec <-
  recipe(class ~ ., data = trained) %>%
  step_downsample(class, under_ratio = tune()) %>%  # downsample on the outcome
  step_umap(
    all_predictors(),
    outcome = "class",
    num_comp = tune(),
    neighbors = tune(),
    min_dist = tune(),
    options = list(
      target_weight = 0.5,
      X = X_dm  # precomputed Dice distances; this is the entry I suspect
    )
  )
# pipeline
wf <- workflow() %>%
  add_recipe(umap_rec) %>%
  add_model(xgb_spec)
# parameter set and tuning grid
umap_param <-
  parameters(wf) %>%
  update(mtry = mtry(c(1, 86)))
xgb_grid <- grid_latin_hypercube(
  umap_param,
  size = 10
)
vb_folds <- vfold_cv(trained, v = 3)
# run the grid search in parallel on 7 workers
cl <- makePSOCKcluster(7)
registerDoParallel(cl)
umap_tune_grid <- wf %>%
  tune_grid(
    resamples = vb_folds,
    grid = xgb_grid,
    param_info = umap_param,
    control = control_grid(verbose = FALSE),
    metrics = metric_set(f_meas)
  )
stopCluster(cl)
But I get this error:
Error in `estimate_tune_results()`:
! All of the models failed. See the .notes column.
Backtrace:
1. umap_tune_grid %>% select_best()
3. tune:::select_best.tune_results(.)
5. tune:::show_best.tune_results(x, metric = metric, n = 1)
6. tune::estimate_tune_results(x)
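Since the message points at the `.notes` column, this is roughly how I would pull the underlying messages out. `first_notes()` is a hypothetical helper written for this post, and it assumes that each element of `.notes` is a data frame whose last column holds the message text (that layout is my assumption about the tuning results object):

```r
# Hypothetical helper: collect the distinct messages stored in the
# .notes list-column of a tuning result (one data frame per resample).
first_notes <- function(res, n = 5) {
  msgs <- unlist(lapply(res$.notes, function(x) {
    if (ncol(x) == 0) return(character(0))  # nothing recorded for this resample
    as.character(x[[ncol(x)]])              # assumed: last column = message text
  }))
  head(unique(msgs), n)
}

# Intended usage with the object from above
# (commented out so the block runs on its own):
# first_notes(umap_tune_grid)
```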
The error seems to come from the `X = X_dm` entry in the `options` list of `step_umap`. I don't know how to make `step_umap` use the custom Dice metric.
Is that possible, and if so, how can I do it?
Have a good day, and thanks in advance.