Questions tagged [tidymodels]

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles.

The tidymodels framework is a "meta-package" for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse. It includes a core set of packages that are loaded on startup, and extra packages that are installed along with tidymodels but not attached on startup. The tidymodels framework provides tooling for modeling tasks including supervised machine learning (predictive modeling), unsupervised machine learning, time series analysis, text analysis, and more.

613 questions
1
vote
0 answers

Using glmnet engine in tidymodels to fit models with percent data as response

I am interested in using penalized regression (LASSO) with the glmnet engine in tidymodels to fit a model with a response variable that is continuous and bound between 0 and 1. I am familiar with how to fit a model to this type of data using a more…
Bryant
  • 11
  • 2
1
vote
1 answer

Use Purrr to filter and estimate the same model specification multiple times

I have a dataset that I want to use to estimate some linear models. The data can be replicated with the code below. set.seed(123) df <- tibble( location= c(rep("A",5),rep("B",5), rep("C",5), rep("D",5), rep("E",5), rep("F",5), …
1
vote
1 answer

tidymodels: loss_accuracy provides no variable importance results

Using the iris dataset, a k-NN classifier was tuned with iterative search for multiclass classification. However, using loss_accuracy in DALEX::model_parts() for variable importance returns empty results. I would appreciate any ideas. Thank you so…
1
vote
1 answer

R kernelshap package with tidymodels with classification

Problem when trying to produce SHAP values for a classification problem using tidymodels. When I try to calculate SHAP values after training my model in tidymodels, following the steps at https://github.com/ModelOriented/kernelshap, I can't…
1
vote
2 answers

How to use broom::tidy for Tobit models?

Dear Stackoverflow community, I am struggling with using the tidy function from the broom package. I need this function in the context of a multiple imputation. You can see here a reprex example using a ggplot2 dataset. library(ggplot2, quietly =…
HuMaN90
  • 13
  • 2
1
vote
1 answer

How to avoid inconsistent column name of predict() outputs?

Context: The tidymodels meta-package helps streamline machine learning methods. I am trying to use it on my data and explore the model with the DALEX package. Problem: When I use tune::last_fit() or DALEXtra::explain_tidymodels() on the workflow, I…
Paul
  • 2,850
  • 1
  • 12
  • 37
1
vote
1 answer

In R tidymodels how do I set the default level for evaluating a logistic model

I would like to tell yardstick that the default level for my logistic model is the second level. I know I can specify individual statistics with event_level = "second" but I would prefer to set the event_level with a global option. When I try it…
itsMeInMiami
  • 2,324
  • 1
  • 13
  • 34
1
vote
1 answer

Regression trees with tidymodels

When attempting to use regression trees, how do you determine if/when to use tune_grid() versus fit_resamples()? I tried these two things: 1. using tune_grid tune_spec <- decision_tree(min_n = tune(), tree_depth = tune(), cost_complexity = tune()) %>%…
1
vote
1 answer

Error in `purrr::map()`, unable to use step_select_*() from colino and workflow_map() together

I am trying to use different step_select functions from the colino package before fitting multiple models with workflow_set(). Apparently, using a step_select function as preprocessing in the recipe makes it impossible to use workflow_map() afterwards. The exact…
1
vote
1 answer

broom::augment errors with recipes::step_log

I am trying to use the tidymodels framework to predict outcome based on predictor. I want to use log(outcome) ~ predictor as my model, but outcome has zeroes. Therefore, I use recipes::step_log(outcome, offset = 0.0001). However, broom::augment()…
joshbrows
  • 77
  • 7
1
vote
1 answer

How to evaluate joint importance of two features in a model (random forest) using R package such as VIP or DALEXtra?

I know how to use these packages (VIP etc.) with tidymodels to evaluate individual feature contribution/importance for a model such as random forest. But I'd like to know how to evaluate a combined or joint importance for two or more features. Probably I…
Xiaokuan Wei
  • 135
  • 6
1
vote
1 answer

log transform outcome variable in tidymodels workflow

I'm having a hard time getting a tidymodels workflow to make new predictions. Specifically, I want to log transform my outcome variable as part of the workflow, but when it comes time to predict new observations this piece of the workflow…
JmPearl
  • 11
  • 2
1
vote
1 answer

Error in `step_log()`: When trying to make predictions with my model

I'm trying to make predictions with my testing data using my finalized workflow. But whenever I try using the predict function, it gives me this error: Error in `step_log()`: ! The following required column is missing from `new_data` in step…
Chelsea Lu
  • 11
  • 1
1
vote
1 answer

How to get a variable importance graph from a random forest using Tidymodels and vip

The dataset I use is the following (mushroom): https://archive.ics.uci.edu/ml/datasets/mushroom I specified the following recipe, model and workflow of a Random Forest using Tidymodels: df_recipe_mixt <- df_train |> recipe(class ~ cap_diameter…
Smorg
  • 55
  • 4
1
vote
1 answer

What does this actual vs. predicted plot mean?

I am currently testing different models to get the best predicted outcome. My measure for model effectiveness is RMSE. I am using the tidymodels package to go through this, and have used regular grids to tune models with 5-fold cross validation and…
nth1824
  • 27
  • 3