Considering this post: https://www.tidyverse.org/blog/2020/06/dplyr-1-0-0/

I was trying to fit multiple models to a data set using multiple formulas. The example in the post is:

library(dplyr, warn.conflicts = FALSE)

models <- tibble::tribble(
  ~model_name,    ~ formula,
  "length-width", Sepal.Length ~ Petal.Width + Petal.Length,
  "interaction",  Sepal.Length ~ Petal.Width * Petal.Length
)

iris %>% 
  nest_by(Species) %>% 
  left_join(models, by = character()) %>% 
  rowwise(Species, model_name) %>% 
  mutate(model = list(lm(formula, data = data))) %>% 
  summarise(broom::glance(model))

As you can see, rowwise() is used to get the answer, but when I leave it out I still get correct results:

iris %>%
  nest_by(Species) %>% 
  left_join(models, by = character()) %>% 
  mutate(model = list(lm(formula, data = data))) %>% 
  summarise(broom::tidy(model))

The only thing I lose is the "model_name" column. The rowwise() documentation says the function is there to compute on a data frame one row at a time, so I don't understand why the models are still fitted row by row without it. Why does this happen?

Thanks in advance.

Victor Espinoza
  • I think that it is described in the documentation of `nest_by()`: "nest_by() returns a rowwise data frame, which makes operations on the grouped data particularly elegant. See vignette("rowwise") for more details." – tmfmnk Jan 29 '21 at 18:47
  • Hi, as you say, `nest_by(Species)` gives me a tibble with the "Rowwise: Species" attribute, but `rowwise(Species, model_name)` only adds "model_name" to that; the computed values don't change. – Victor Espinoza Jan 29 '21 at 20:00

1 Answer

Considering https://cran.r-project.org/web/packages/dplyr/vignettes/rowwise.html, which says:

You can optionally supply “identifier” variables in your call to rowwise(). These variables are preserved when you call summarise(), so they behave somewhat similarly to the grouping variables passed to group_by():

I didn't understand how identifiers work at first. As far as I can tell, the identifiers here (Species, model_name) don't affect how a value is computed, only which columns are kept in the output of summarise().
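A minimal sketch of that point, using a hypothetical toy tibble (the names `toy`, `id`, `x`, and `total` are made up for illustration and are not from the blog post). The per-row result is the same with or without an identifier; only the kept columns differ:

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data: a label column and a list-column of numbers
toy <- tibble(id = c("a", "b"), x = list(1:3, 4:6))

# Row-wise sum without and with an identifier variable
no_id   <- toy %>% rowwise()   %>% summarise(total = sum(x), .groups = "drop")
with_id <- toy %>% rowwise(id) %>% summarise(total = sum(x), .groups = "drop")

names(no_id)    # only the computed column
names(with_id)  # identifier column plus the computed column

# The computed values themselves are identical
identical(no_id$total, with_id$total)
```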

So if you already have a rowwise tibble, such as the one created by nest_by(), you don't need to call rowwise() again to compute by row. In my example, the extra rowwise(Species, model_name) call only keeps an additional column of information in the summarised output; the linear models are the same either way. It makes the result more convenient to read, but it doesn't change how the values are computed.

Thanks to tmfmnk.
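This can be checked directly. The sketch below assumes only dplyr and the built-in iris data; the column name `n` is chosen for illustration. It inspects the result of nest_by() and then computes per row without any explicit rowwise() call:

```r
library(dplyr, warn.conflicts = FALSE)

nested <- iris %>% nest_by(Species)

# nest_by() already returns a rowwise data frame keyed by Species
is_rowwise <- inherits(nested, "rowwise_df")
keys <- group_vars(nested)

# No rowwise() here: `data` still refers to one nested data frame per row
counts <- nested %>% summarise(n = nrow(data), .groups = "drop")

is_rowwise
keys
counts
```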

Victor Espinoza