Questions tagged [tidymodels]

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles.

The tidymodels framework is a "meta-package" for modeling and statistical analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse. It includes a core set of packages that are loaded on startup, and extra packages that are installed along with tidymodels but not attached on startup. The tidymodels framework provides tooling for modeling tasks including supervised machine learning (predictive modeling), unsupervised machine learning, time series analysis, text analysis, and more.

613 questions
0 votes, 0 answers

axis ticks of autoplot object

In the example to compare plot model results from many different resampled tidymodels, how can I change the labels of the x-axis ticks? autoplot( grid_results, rank_metric = "rmse", # <- how to order models metric = "rmse", # <-…
S Front
0 votes, 1 answer

passing a list of variables to recipe in tidymodels causes model error

I have a simple recipe to train a model. My categorical variables change over time, and sometimes I want a numeric variable (postal code) to be treated as categorical, so I define a list containing them before the recipe. (just for the sake of the…
Forge
0 votes, 1 answer

Tidymodels: Slow hyperparameter tuning with data with multiple predictors

I'm presently trying to fit a random forest model with hyperparameter tuning using the tidymodels framework on a dataframe with 101,064 rows and 64 columns. I have a mix of categorical and continuous predictors and my outcome variable is a…
0 votes, 1 answer

Error: "Missing data in columns" when using tidymodels workflow to predict test set

I recently learned to use tidymodels to build a machine learning workflow, but when I use the workflow to predict on the test set, it raises the error "Missing data in columns", even though I am sure that neither the training nor the test set has…
Kim.L
0 votes, 1 answer

Tidymodels: Impute missing values in a Date column?

This question is a duplicate of "Tidymodels: What is the correct way to impute missing values in a Date column?" As that question was closed, I am providing a reprex and raising the question again. I struggle a bit with missing values in a Date column. In my…
Mischa
0 votes, 1 answer

Extract feature importance in penalized logistic regression model

I use this example from the tidymodels website for my own data ( https://www.tidymodels.org/start/case-study/ ). In contrast to this example, my data demonstrate that penalized logistic regression outperforms random forest in terms of…
Dite Bayu
0 votes, 1 answer

Tidymodels / XGBoost error in last_fit with rsplit value

I am trying to follow this tutorial: https://juliasilge.com/blog/xgboost-tune-volleyball/ I am using it on the most recent Tidy Tuesday dataset about Great Lakes fishing, trying to predict agency based on many other values. ALL of the code…
Indescribled
0 votes, 1 answer

Tidymodels: What is the correct way to impute missing values in a Date column?

I struggle a bit with missing values in a Date column. In my pre-processing pipeline (recipe-object) I used the step_impute_knn function to fill missing values in all my Date columns. Unfortunately I got the following error: Assigned data pred_vals…
Mischa
0 votes, 1 answer

Error / Warning "There are new levels in a factor: NA"

I am working on creating a Random Forest model using the tidymodels approach. In the recipe function, I get this error/warning that I simply cannot interpret, but it must be something related to the summary variables created. The error is There are…
0 votes, 1 answer

tidymodel error, when calling predict function is asking for target variable

I have trained a churn model with tidymodels on customer data (more than 200 columns). I got fairly good metrics using xgboost, but the issue arises when trying to predict on new data. The predict function asks for the target variable (churn) and I am a bit confused as…
Forge
0 votes, 1 answer

Parameters for hyperparameter grid search functions in tidymodels tuning

I’m using {workflowsets} from {tidymodels} for the first time, and I’m following along chapter in Tidy Modeling with R. In the book, the authors use a fixed, regular grid hyperparameter search: grid_results <- all_workflows %>% workflow_map( …
Marco B
0 votes, 1 answer

How do I see the selected k from parsnip::nearest_neighbor()

If I fit a k-nearest neighbors model using parsnip::nearest_neighbor(), what k is selected if I don't specify how to tune? I am trying to figure out what k is selected here: the_model <- nearest_neighbor() %>% set_engine("kknn") %>% …
itsMeInMiami
0 votes, 1 answer

Error when trying to add a recipe column to a tibble

While trying to add a recipe column to a tibble, following the steps of this Rsample Tidymodels article, I got the following error message: Error: Not all variables in the recipe are present in the supplied training set: 'ticker', 'ret_3m',…
0 votes, 1 answer

Plotting issues - Partial dependence plots

The following explain_tidymodels explainer is created to display partial dependence plots. explainer <- explain_tidymodels(rf_vi_fit, data = Data_train, y = Data_train$Lead_week) Now I'm creating plots by doing the following: model_profile(explainer,…
Kylian
0 votes, 1 answer

Error while creating workflow from a recipe using linear models in R

I am training a linear regression model predicting salary from company size (company_size_number) and country (country) using the StackOverflow data. The steps I perform are: Read the data. Split the data into a training set (75%) and a test set…
Ranji Raj