I'm trying to run a multinomial logistic regression in R using tidymodels but I can't convert my results to a tidy object. Here's a sample using the iris data set.

# Multinomial  -----------------------------------------------------------------
library(tidymodels)

# recipe
multinom_recipe <-
  recipe(Species ~ Sepal.Length + Sepal.Width + Petal.Width, data = iris) %>% 
  step_relevel(Species, ref_level = "setosa")

# model 
multinom_model <-  multinom_reg() %>% 
  set_engine("nnet")

# make workflow
multinom_wf <- 
  workflow() %>% 
  add_model(multinom_model) %>% 
  add_recipe(multinom_recipe) %>% 
  fit(data = iris) %>% 
  tidy()

multinom_wf

The last step throws the following error:

Error in eval(predvars, data, env) : object '..y' not found

I thought it was because the output of fit(data = iris) is a workflow object, but the same code works fine when I don't use a workflow (and using workflows is the whole point of tidymodels), or when I fit a linear model instead:

# recipe
linear_recipe <-
  recipe(Sepal.Length ~ Sepal.Width + Sepal.Length + Petal.Width, data = iris) 

# model 
linear_model <-  linear_reg() %>% 
  set_engine("lm")

# make workflow
linear_wf <- 
  workflow() %>% 
  add_model(linear_model) %>% 
  add_recipe(linear_recipe) %>% 
  fit(data = iris) %>% 
  tidy()

linear_wf
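
For comparison, here is a minimal sketch of the non-workflow route mentioned above. It assumes that "not using a workflow" means calling nnet::multinom() directly and passing the result to broom's tidy(); the name direct_fit is only illustrative.

```r
library(nnet)
library(broom)

# Fit the multinomial model directly, without parsnip or workflows.
# trace = FALSE suppresses the iteration log printed by nnet.
direct_fit <- nnet::multinom(
  Species ~ Sepal.Length + Sepal.Width + Petal.Width,
  data = iris,
  trace = FALSE
)

# The stored call refers to `iris`, which exists in the global environment,
# so tidy() can re-evaluate it without trouble.
direct_fit$call
tidied <- tidy(direct_fit)
tidied
```

Because the call here points at objects that are still visible when tidy() runs, this version avoids the '..y' lookup that fails in the workflow version.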

Does anyone have an idea what I'm missing, or is this a bug?

kaseyzapatka

2 Answers

2

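The error most likely comes from the call stored in the fitted model object: tidy() needs to re-evaluate that call, but it refers to objects (such as ..y) that are not available outside the workflow's fitting step. One workaround is to overwrite the stored call: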

multinom_wf$fit$fit$fit$call <- quote(nnet::multinom(formula = Species ~ ., data = iris, trace = FALSE))
multinom_wf %>%
  tidy()

Output:

# A tibble: 8 x 6
  y.level    term         estimate std.error statistic p.value
  <chr>      <chr>           <dbl>     <dbl>     <dbl>   <dbl>
1 versicolor (Intercept)      4.17      12.0    0.348  0.728  
2 versicolor Sepal.Length     1.08      42.0    0.0258 0.979  
3 versicolor Sepal.Width     -9.13      81.5   -0.112  0.911  
4 versicolor Petal.Width     20.9       14.0    1.49   0.136  
5 virginica  (Intercept)    -16.0       12.1   -1.33   0.185  
6 virginica  Sepal.Length     2.37      42.0    0.0563 0.955  
7 virginica  Sepal.Width    -13.9       81.5   -0.171  0.864  
8 virginica  Petal.Width     36.8       14.1    2.61   0.00916

where multinom_wf is the fitted workflow, without the final tidy() step:

multinom_wf <- 
  workflow() %>% 
  add_model(multinom_model) %>% 
  add_recipe(multinom_recipe) %>% 
  fit(data = iris)
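
To see what the stored call looks like before overwriting it, a minimal sketch along these lines may help. It reuses the $fit$fit$fit$call path from the assignment above; the name multinom_fit is only illustrative, and the exact call that gets printed may differ between parsnip and workflows versions.

```r
library(tidymodels)

# Fit the workflow, stopping before tidy()
multinom_fit <-
  workflow() %>%
  add_model(multinom_reg() %>% set_engine("nnet")) %>%
  add_recipe(
    recipe(Species ~ Sepal.Length + Sepal.Width + Petal.Width, data = iris)
  ) %>%
  fit(data = iris)

# workflow -> parsnip model fit -> underlying nnet object -> its call
stored_call <- multinom_fit$fit$fit$fit$call
print(stored_call)
```

If the printed call mentions names that do not exist in your session, that is the part tidy() trips over when it re-evaluates the call.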
akrun
  • Awesome. My next step will be to learn about expressions! – TarJae Jul 26 '21 at 21:28
  • @akrun, so it sounds like this is a bug then. – kaseyzapatka Jul 26 '21 at 21:54
  • @akrun and @TarJae, thanks for your help, but it didn't solve my issue. I think it has something to do with the fact that it's a multinomial regression, which has more than one outcome category, and the function is not grabbing all of them? It throws the same error every time: `Error in eval(predvars, data, env) : object '..y' not found`. This is a somewhat similar issue: https://stackoverflow.com/questions/50539633/error-in-evalpredvars-data-env-object-rm-not-found – kaseyzapatka Jul 27 '21 at 20:46
  • @kaseyzapatka So you are still getting the error even with the fix I provided? – akrun Jul 27 '21 at 20:48
  • Thanks, @akrun. I did just get it to run after running into an error at first. It seems this is a bug in the call, right? – kaseyzapatka Jul 27 '21 at 21:09
  • @kaseyzapatka It seems to be a bug. I think different model calls have some issues, i.e. I noticed `glm` has some issues as well. – akrun Jul 27 '21 at 21:11
  • @akrun I reported it as an issue on GitHub. I found it to be an issue only when the engine is set to `nnet`, but not when it's set to `glmnet`. Weird. I hope they can fix it. Thanks. – kaseyzapatka Jul 27 '21 at 21:15
  • @kaseyzapatka I hope so. With lots of models around, this would be a challenge. Wouldn't it be more efficient to construct the model outside of a workflow? – akrun Jul 27 '21 at 21:17
  • @akrun, yes, I'm moving away from using the tidymodels workflow here, although it seems to be good practice when you have a lot of models. Unfortunately, it's more cumbersome than it's worth in this case. – kaseyzapatka Jul 28 '21 at 16:08
2

We have a function repair_call() in parsnip to fix up the call objects for packages that don't play nicely with "typical" norms; you can read more about it in the repair_call() documentation.

library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip

multinom_model <-  multinom_reg() %>% 
   set_engine("nnet")

nnet_fit <- 
   multinom_model %>%
   fit(Species ~ Sepal.Length + Sepal.Width + Sepal.Length + Petal.Width, data = iris)

tidy(nnet_fit)
#> Error in model.frame.default(formula = Species ~ Sepal.Length + Sepal.Width + : 'data' must be a data.frame, environment, or list

nnet_fixed <- repair_call(nnet_fit, data = iris)
tidy(nnet_fixed)
#> # A tibble: 8 × 6
#>   y.level    term         estimate std.error statistic p.value
#>   <chr>      <chr>           <dbl>     <dbl>     <dbl>   <dbl>
#> 1 versicolor (Intercept)      4.17     260.     0.0160   0.987
#> 2 versicolor Sepal.Length     1.08      64.8    0.0167   0.987
#> 3 versicolor Sepal.Width     -9.13      80.4   -0.114    0.910
#> 4 versicolor Petal.Width     20.9       98.1    0.213    0.831
#> 5 virginica  (Intercept)    -16.0      261.    -0.0616   0.951
#> 6 virginica  Sepal.Length     2.37      64.8    0.0365   0.971
#> 7 virginica  Sepal.Width    -13.9       80.4   -0.173    0.862
#> 8 virginica  Petal.Width     36.8       98.2    0.375    0.708

Created on 2021-08-01 by the reprex package (v2.0.0)

Julia Silge