Questions tagged [r-ranger]
69 questions
15
votes
1 answer
Predicted probabilities in R ranger package
I am trying to build a random forest classification model in R (by editing code by Ned Horning). I first used the randomForest package but then found ranger, which promises faster calculations.
At first, I used the code below to get predicted…

Batuhan Kavlak
- 337
- 3
- 9
3
votes
0 answers
Why is tidymodels with a ranger engine so much slower than ranger?
I'm taking a first look at tidymodels. My alternative for the current project would be non-tidyfied ranger. On a test run, classification random forest with tidymodels using the ranger engine is much slower than hand-held ranger (approximately ten…

lambdatau
- 31
- 3
3
votes
0 answers
Extracting leaf indices that each sample was assigned to in the forest from random forests (RF) in R
I'm trying to transpile code from Python to R in order to do supervised dimensionality reduction with Random Forests and UMAP following instructions from this blog post.
I need to get an array that contains the leaf indices that each sample was…

Matthew J. Oldach
- 618
- 8
- 24
2
votes
1 answer
Fine-tuning hyperparameters of a Random Forest model: number of trees
I am using the caret package to tune a Random Forest (RF) model using ranger. Because in the ranger package I can't tune the number of trees, I am using the caret package. The metric to find the optimal number of trees is R-Squared. The range of…

Nikos
- 426
- 2
- 10
2
votes
1 answer
Running Shiny App locally shows "Error: sample_fraction too small, no observations sampled. Ranger will EXIT now"
I'm trying to run my Shiny App locally in RStudio but I always get this error message:
> shinyApp(ui, server)
Listening on http://127.0.0.1:5603
Error: sample_fraction too small, no observations sampled. Ranger will EXIT now.
Warning: Error in…

TomR
- 81
- 7
2
votes
1 answer
How to save a ranger model in mlr3 without data?
I have created a ranger model using the mlr3 library. I saved this model to my machine using the following command. The created file is huge. The saved file also contains the data along with the model. Is there a way to only save the model without the…

Saurabh
- 1,566
- 10
- 23
2
votes
1 answer
plotting tidymodel results with roc_curve() receives numeric vs. character error
I am teaching myself how to use the excellent tidymodels collection of packages to practice machine learning.
In the example below, I am basically trying to reproduce Julia Silge's blog post here (https://juliasilge.com/blog/water-sources/) on using…

alejandro_hagan
- 843
- 2
- 13
2
votes
0 answers
range of meaningful values for ranger importance using impurity_corrected
Using the ranger R package, when I obtain importance with the 'impurity_corrected' option, some of the importance values are negative.
My understanding was that the importance values would be from 0 to 1, with 0 meaning not important features (i.e. not…

abhivij
- 117
- 8
2
votes
0 answers
Getting error when predicting Random forest AUC
I am using the R packages caret and ranger to develop a classifier to predict the risk of dying, but I am having trouble calculating AUC:
I am aware that I need to set probability = TRUE when training the model, however, I get an error…

Lærke Kjær
- 21
- 3
2
votes
0 answers
R: how to get the same (high-quality) results from ranger using aligned setting for h2o(.ai) randomForest
tl;dr What setting in either R::ranger or h2o.ai::randomForest can account for the very different performances on the exact same data?
Background:
I'm trying to classify using a somewhat strongly imbalanced dataset, and the measure-of-goodness…

EngrStudent
- 1,924
- 31
- 46
2
votes
1 answer
Missing data after step_naomit in fit_resamples
I am currently applying the following recipe and workflow to fit a Random Forest with 5-fold cross-validation using fit_resamples. The workflow looks something like this:
library(tidymodels)
# import data and convert response to…

anddt
- 1,589
- 1
- 9
- 26
2
votes
0 answers
How do you use more available cores when using doParallel to tune models in tidymodels?
I'm tuning some random forest models using ranger in tidymodels. I have a fairly large dataset with many columns. As a result, I set up a DigitalOcean droplet for tuning/training using instructions from Danny Foster's article: R on Digital Ocean.…

Mutuelinvestor
- 3,384
- 10
- 44
- 75
2
votes
2 answers
SHAP Importance for Ranger in R
Given a binary classification problem,
how would it be possible to get the SHAP contributions of the variables for a ranger model?
Sample data:
library(ranger)
library(tidyverse)
# Binary Dataset
df <- iris
df$Target <- if_else(df$Species ==…

PeCaDe
- 277
- 1
- 8
- 33
2
votes
1 answer
How to figure out which column names are illegal in ranger?
Here is a ranger call:
rf_fit <-
rf_mod %>%
fit(my_outcome_factor ~ ., data = data_train)
and the output:
Error in parse.formula(formula, data, env = parent.frame()) :
Error: Illegal column names in formula interface. Fix column names or…

dfrankow
- 20,191
- 41
- 152
- 214
1
vote
1 answer
Variable Importance P-Values
Can the importance_pvalues (https://rdrr.io/cran/ranger/man/importance_pvalues.html) command be used via mlr3? In other words, can I indicate that I would like the p-values outputted in my call to the learner? If not, how would I go about extracting…

DeLuca Lab
- 13
- 2