Consider this simple example:
library(dplyr)
library(broom)
dataframe <- tibble(id = c(1,2,3,4,5,6),
                    group = c(1,1,1,2,2,2),
                    value = c(200,400,120,300,100,100))
# A tibble: 6 x 3
     id group value
  <dbl> <dbl> <dbl>
1     1     1   200
2     2     1   400
3     3     1   120
4     4     2   300
5     5     2   100
6     6     2   100
Here I want to group by group and create two columns. One is the number of distinct values in value (I can use dplyr::n_distinct); the other is the constant term from a regression of value on the vector 1. That is, the output of:
tidy(lm(data = dataframe, value ~ 1)) %>% select(estimate)
estimate
1 203.3333
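As a sanity check on the target values, here is a rough base-R sketch. It relies on the fact that the intercept of a regression on the constant 1 is the sample mean, so the per-group estimates should equal the per-group means. The names `df`, `intercept_all`, and `by_group` are illustrative only, and `df` is just a copy of the example data.

```r
# Illustrative copy of the example data (base R only, no packages needed)
df <- data.frame(
  id    = c(1, 2, 3, 4, 5, 6),
  group = c(1, 1, 1, 2, 2, 2),
  value = c(200, 400, 120, 300, 100, 100)
)

# Intercept of an intercept-only regression on the full data
intercept_all <- unname(coef(lm(value ~ 1, data = df))[1])

# It should agree with the plain sample mean of value
isTRUE(all.equal(intercept_all, mean(df$value)))

# The same regression fitted separately within each group;
# these are the per-group values the new column should contain
by_group <- tapply(df$value, df$group, function(v) unname(coef(lm(v ~ 1))[1]))
by_group
```

This only shows which numbers are expected per group; it does not produce the grouped mutate result itself.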
The difficulty here is combining these two simple outputs into a single mutate statement that preserves the grouping.
I tried something like:
formula1 <- function(data, myvar){
  tidy(lm(data = data, myvar ~ 1)) %>% select(estimate)
}
dataframe %>% group_by(group) %>%
  mutate(distinct = n_distinct(value),
         mean = formula1(., value))
but this does not work. What am I missing here? Thanks!