Questions tagged [r-ranger]

69 questions
15
votes
1 answer

Predicted probabilities in R ranger package

I am trying to build a random forest classification model in R, adapting code by Ned Horning. I first used the randomForest package but then found ranger, which promises faster calculations. At first, I used the code below to get predicted…
Batuhan Kavlak
  • 337
  • 3
  • 9
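For readers landing on this question from the listing, here is a minimal sketch of one way to get class probabilities from ranger. It assumes the built-in iris data as a stand-in for the asker's data, and the names fit and probs are illustrative only. It relies on ranger's probability = TRUE argument, which grows a probability forest.

```r
library(ranger)

# Assumption: iris stands in for the asker's own training data.
set.seed(42)
fit <- ranger(Species ~ ., data = iris, num.trees = 100, probability = TRUE)

# With probability = TRUE, $predictions is a matrix with one row per
# observation and one column per class.
probs <- predict(fit, data = iris)$predictions
head(probs)
```

Each row of probs should sum to 1. A forest trained without probability = TRUE returns predicted class labels instead.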
3
votes
0 answers

Why is tidymodels with a ranger engine so much slower than ranger?

I'm taking a first look at tidymodels. My alternative for the current project would be non-tidyfied ranger. On a test run, classification random forest with tidymodels using the ranger engine is much slower than hand-held ranger (approximately ten…
lambdatau
  • 31
  • 3
3
votes
0 answers

Extracting leaf indices that each sample was assigned to in the forest from random forests (RF) in R

I'm trying to transpile code from Python to R in order to do supervised dimensionality reduction with Random Forests and UMAP following instructions from this blog post. I need to get an array that contains the leaf indices that each sample was…
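A possible starting point, sketched under the assumption that ranger is the forest implementation in use: its predict method accepts type = "terminalNodes", which returns terminal node IDs rather than class predictions. The iris data and the names fit and leaves are placeholders, not taken from the question.

```r
library(ranger)

# Assumption: iris stands in for the asker's own data.
set.seed(42)
fit <- ranger(Species ~ ., data = iris, num.trees = 50)

# type = "terminalNodes" returns the terminal node ID reached by each
# sample in each tree: one row per sample, one column per tree.
leaves <- predict(fit, data = iris, type = "terminalNodes")$predictions
dim(leaves)
```

The resulting matrix plays a similar role to the leaf-index array from the Python workflow and can be passed on to a dimensionality reduction step.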
2
votes
1 answer

Fine-tuning hyperparameters of a Random Forest model: number of trees

I am using the caret package to tune a Random Forest (RF) model using ranger. Because in the ranger package I can't tune the number of trees, I am using the caret package. The metric to find the optimal number of trees is R-Squared. The range of…
Nikos
  • 426
  • 2
  • 10
2
votes
1 answer

Running Shiny App locally shows "Error: sample_fraction too small, no observations sampled. Ranger will EXIT now"

I'm trying to run my Shiny App locally in RStudio but I always get this error message: > shinyApp(ui, server) Listening on http://127.0.0.1:5603 Error: sample_fraction too small, no observations sampled. Ranger will EXIT now. Warning: Error in…
TomR
  • 81
  • 7
2
votes
1 answer

How to save a ranger model in mlr3 without data?

I have created a ranger model using the mlr3 library. I saved this model to my machine using the following command. The created file is huge. The saved file also contains the data along with the model. Is there a way to only save the model without the…
Saurabh
  • 1,566
  • 10
  • 23
2
votes
1 answer

plotting tidymodels results with roc_curve() receives numeric vs. character error

I am teaching myself how to use the excellent tidymodels collection of packages to practice machine learning. In the below example, I am basically trying to reproduce Julia Silge's blog post here (https://juliasilge.com/blog/water-sources/) on using…
alejandro_hagan
  • 843
  • 2
  • 13
2
votes
0 answers

range of meaningful values for ranger importance using impurity_corrected

Using the ranger R package, when obtaining importance with the 'impurity_corrected' option, I get some negative importance values. My understanding was that the importance values would be from 0 to 1, with 0 meaning not important features (i.e. not…
abhivij
  • 117
  • 8
2
votes
0 answers

Getting error when predicting Random forest AUC

I am using the R packages caret and ranger to develop a classifier to predict the risk of dying, but I am having trouble calculating AUC: I am aware that I need to set probability = TRUE when training the model, however, I get an error…
2
votes
0 answers

R: how to get the same (high-quality) results from ranger using aligned setting for h2o(.ai) randomForest

tl;dr What setting in either R::ranger or h2o.ai::randomForest can account for the very different performances on the exact same data? Background: I'm trying to classify using a somewhat strongly imbalanced dataset, and the measure-of-goodness…
EngrStudent
  • 1,924
  • 31
  • 46
2
votes
1 answer

Missing data after step_naomit in fit_resamples

I am currently applying the following recipe and workflow in order to fit a Random Forest with 5-fold cross-validation using fit_resamples. The workflow looks something like this: library(tidymodels) # import data and convert response to…
anddt
  • 1,589
  • 1
  • 9
  • 26
2
votes
0 answers

How do you use more available cores when using doParallel to tune models in tidymodels

I'm tuning some random forest models using ranger in tidymodels. I have a fairly large dataset with many columns. As a result, I set up a DigitalOcean droplet for tuning/training using instructions from Danny Foster's article: R on Digital Ocean.…
Mutuelinvestor
  • 3,384
  • 10
  • 44
  • 75
2
votes
2 answers

SHAP Importance for Ranger in R

I have a binary classification problem: how can I get the SHAP contributions of the variables for a ranger model? Sample data: library(ranger) library(tidyverse) # Binary Dataset df <- iris df$Target <- if_else(df$Species ==…
PeCaDe
  • 277
  • 1
  • 8
  • 33
2
votes
1 answer

How to figure out which column names are illegal in ranger?

Here is a ranger call: rf_fit <- rf_mod %>% fit(my_outcome_factor ~ ., data = data_train) and the output: Error in parse.formula(formula, data, env = parent.frame()) : Error: Illegal column names in formula interface. Fix column names or…
dfrankow
  • 20,191
  • 41
  • 152
  • 214
1
vote
1 answer

Variable Importance P-Values

Can the importance_pvalues (https://rdrr.io/cran/ranger/man/importance_pvalues.html) command be used via mlr3? In other words, can I indicate that I would like the p-values outputted in my call to the learner? If not, how would I go about extracting…
DeLuca Lab
  • 13
  • 2