Questions tagged [tidymodels]

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles.

The tidymodels framework is a "meta-package" for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse. It includes a core set of packages that are loaded on startup, and extra packages that are installed along with tidymodels but not attached on startup. The tidymodels framework provides tooling for modeling tasks including supervised machine learning (predictive modeling), unsupervised machine learning, time series analysis, text analysis, and more.

613 questions
0
votes
1 answer

Using tidymodels' SMOTE with dummies for categorical variables

I am dealing with a heavily imbalanced response variable, so my supervisor has recommended I use SMOTE in order to upsample the minority observations in my data set. The data consists of many categorical predictors and as I understand it…
O René
  • 305
  • 1
  • 12
0
votes
1 answer

How to set the splitting rule in decision_tree spec?

When creating a specification and fitting a decision tree with the tidymodels metapackage and the decision_tree() function, the default splitting method/rule in the rpart package for categorical data is the Gini index, which is set with the params argument of…
dzegpi
  • 554
  • 5
  • 14
0
votes
1 answer

Graphing using list of dataframes

Following up on this post here I use tidymodels to fit a regression to a grouped list of dataframes. Then I predict values into another list of dataframes. # Code from the original question library(dplyr) year <- rep(2014:2018,…
Stata_user
  • 562
  • 3
  • 14
0
votes
1 answer

tidymodels - predict() and fit() giving different model performance results when applied to the same dataset

Currently using the tidymodels framework and struggling to understand some differences in model predictions and performance results I get, specifically when I use both fit and predict on the exact same dataset (i.e. the dataset the model was trained…
0
votes
1 answer

R tidymodels xgboost Ubuntu 20.04 Error: C stack usage 7975188 is too close to the limit

I am trying to run an xgboost model through tidymodels on an Ubuntu server but am getting the following error: Resample01: preprocessor 1/1: Error: C stack usage 7977188 is too close to the limit I've tried all the solutions suggested from googling…
Pat
  • 101
  • 6
0
votes
0 answers

Use survival model in tidymodels and workflows in R

I would like to predict the time to death in a censored type of data using tidymodels, workflows and tuning. I am getting an error message when I build the recipe. set.seed(123) n <- 20 train_data <- data.frame(id=1:n, death_time…
RCchelsie
  • 111
  • 6
0
votes
1 answer

How to add a step to remove a column with constant value?

Background: I'm creating a recipe to clean and transform time-series data that will be used by multiple models. One of the steps in the recipe is to remove correlated predictors using the step_corr() function. However, due to the nature of the data…
fahmy
  • 3,543
  • 31
  • 47
0
votes
1 answer

How to handle forecast data (melt and "unmelt") generated by modeltime prediction - lost variables

Below I created some fake forecast data using the tidyverse modeltime packages. I have got monthly data from 2016 and want to produce a test fc for 2020. As you can see, the data I load comes in wide format. For usage in modeltime I transform it to…
0
votes
1 answer

Calculate confidence intervals on and extract regression summary stats from tidy bootstrapped models on grouped data

I want to calculate confidence intervals on a distribution of slope estimates from bootstrapped linear regression models AND extract regression summary statistics (e.g., r.squared) for each of the bootstrapped models on grouped data. I figured out…
D Kincaid
  • 167
  • 1
  • 13
0
votes
1 answer

Tidymodels. step_impute_linear(), can it be used when every column contains NAs

My data contain >100 columns and every one of them contains NAs, and when I try to use step_impute_linear() it returns a warning: Warning message: There were missing values in the predictor(s) used to impute; imputation did not…
0
votes
1 answer

Error in UseMethod("required_pkgs") : no applicable method for 'required_pkgs' applied to an object of class "workflow"

I'm following Jan Kirenz's tutorial for classification using tidymodels. Everything so far has gone well until I try to evaluate the model using the function fit_resamples(). I keep getting the error message Error in UseMethod("required_pkgs") : no…
0
votes
1 answer

Issue using "pred_yes" column as the estimate argument to roc_curve()

When I run the code below, it shows an incorrect roc_curve. Prep The code below should be runnable for anyone using RStudio. The dataframe contains characteristics of different employees regarding: performance ratings, sales figures, and whether or…
ryanmc25
  • 1
  • 1
0
votes
1 answer

Tidymodels, All models failed; error in model.frame.default and prediction from a rank-deficient fit may be misleading

I am having problems with tidymodels tuning, which gives the following error and warning: warning: prediction from a rank-deficient fit may be misleading Error: Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = ob... Note 1: I am…
0
votes
0 answers

struggling with error involving ridge_models

I am new to R and I have been struggling with this for a while. Can anyone explain to me what I am doing wrong to get this error? str(bike) bike_recipe=recipe …
zack
  • 1
0
votes
1 answer

Is there a way to create a custom metric for use with tune_grid() in tidymodels that allows for a grouped data.frame/tibble?

What I'd like to do I am trying to build a model in tidymodels that will predict the efficacy of drugs on cell lines (like bacteria). The model will rank drugs by efficacy for a given cell line, so I want to use Spearman's correlation (ρ) as a…