Questions tagged [tidymodels]

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles.

The tidymodels framework is a "meta-package" for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse. It includes a core set of packages that are attached when tidymodels is loaded, and extra packages that are installed along with tidymodels but not attached. The tidymodels framework provides tooling for modeling tasks including supervised machine learning (predictive modeling), unsupervised machine learning, time series analysis, text analysis, and more.

613 questions
3 votes, 1 answer

Balanced log loss function in yardstick

Can someone help me figure out how to create a balanced logarithmic loss function in yardstick for use in a tidymodels pipeline? I looked up the documentation on creating custom metrics and I was able to create straightforward custom regression and…
JaredS • 242 • 2 • 5 • 16

3 votes, 1 answer

How to enable parallelization in tidymodels stacks::control_stack_grid()

I am attempting to use the tidymodels stacks package to perform ensemble modeling. Following the instructions provided in their article, I was able to reproduce the example successfully. However, when I added parallelization during hyperparameter…
littleworth • 4,781 • 6 • 42 • 76

3 votes, 1 answer

How to use %>% and calculate multiple metrics in R?

I have a tibble and I am trying to calculate multiple metrics. library(tidymodels) price = 1:50 prediction = price * 0.9 My_tibble = tibble(price=price, prediction=prediction) # The following code can calculate the rmse My_tibble %>% rmse(truth…
Yang Yang • 858 • 3 • 26 • 49

3 votes, 1 answer

How to save parsnip/agua based H2O object and retrieve it again

I have the following script using tidymodels' agua package: library(tidymodels) library(agua) library(ggplot2) theme_set(theme_bw()) h2o_start() data(concrete) set.seed(4595) concrete_split <- initial_split(concrete, strata =…
littleworth • 4,781 • 6 • 42 • 76

3 votes, 0 answers

Pins + Vetiver vs MLflow which one to choose for MLOps

I am a big fan of tidymodels and have played around with vetiver + pins in R and Python, not only to develop models but also to deploy them. However, if you are looking for tools that support MLOps, sooner or later you will…
Mischa • 137 • 8

3 votes, 1 answer

How do I extract the classification tree from this parsnip model in R?

I am working through 'Machine Learning & R Expert techniques for predictive modeling' by Brett Lantz. I am using the tidymodels suite as I try the example modeling exercises in R. I am working through chapter 5 in which you build a decision tree…
3 votes, 1 answer

Tidymodels: How to extract importance from training data

I have the following code, where I do some grid search for different mtry and min_n. I know how to extract the parameters that give the highest accuracy (see second code box). How can I extract the importance of each feature in the training dataset?…
Orestis • 53 • 5

3 votes, 2 answers

How to save Tidymodels Lightgbm model for reuse

I have the following code for creating a tidymodels workflow with a lightgbm model. However, there is a problem when I try to save it into an .rds object and…
3 votes, 1 answer

Create a multivariate matrix in tidymodels recipes::recipe()

I am trying to do a k-fold cross-validation on a model that predicts the joint distribution of the proportion of tree species basal area from satellite imagery. This requires the use of the DirichletReg::DirichReg() function, which in turn…
Sean McKenzie • 707 • 3 • 13

3 votes, 1 answer

Why does the `class::knn()` function give different results from `kknn::kknn()` with a fixed k?

I am trying to convert the base R code in Introduction to Statistical Learning into the R tidymodels ecosystem. The book uses class::knn() and tidymodels uses kknn::kknn(). I got different results when doing knn, with a fixed k. So I stripped out…
itsMeInMiami • 2,324 • 1 • 13 • 34

3 votes, 1 answer

Verify model assumptions with tidymodels

Outside of the tidymodels universe, it's easy to verify model assumptions. For example, with linear regression (function lm), the performance package creates understandable graphics and provides easy-to-use functions (such as check_heteroscedasticity()) to verify assumptions…
Polo • 33 • 2

3 votes, 1 answer

How to fit a model without an intercept using R tidymodels workflow?

How can I fit a model using this tidymodels workflow? library(tidymodels) workflow() %>% add_model(linear_reg() %>% set_engine("lm")) %>% add_formula(mpg ~ 0 + cyl + wt) %>% fit(mtcars) #> Error: `formula` must not contain the intercept…
David Rubinger • 3,580 • 1 • 20 • 29

3 votes, 2 answers

Bootstrap resampling and tidy regression models with grouped/nested data

I am trying to estimate regression slopes and their confidence intervals using bootstrapping. I would like to do it for grouped data. I was following the example at this website (https://www.tidymodels.org/learn/statistics/bootstrap/), but I…
D Kincaid • 167 • 1 • 13

3 votes, 2 answers

Set tuning parameter range a priori

I know that in tidymodels you can set a custom tunable parameter space by interacting directly with the workflow object as follows: library(tidymodels) model <- linear_reg( mode = "regression", engine = "glmnet", penalty = tune() …
Marco Repetto • 336 • 2 • 15

3 votes, 2 answers

Logistic regression with tidymodels: How to set event_level ="second" in last_fit()?

I am building a logistic regression model with an outcome variable with 2 categories: a_category / z_category, and I have the following questions: I am interested in predicting "z_category" using the independent variables, therefore my reference…