I have a function that takes a dataset and a variable as arguments. It needs to filter on some criteria, pull the column whose name is given by the variable, and then calculate its mean. However, I am not able to pass the variable as a column name. Kindly help.

MeanFor <- formula(df, flag, var){
  df2 <- df %>% filter(member == flag) %>% pull(var) %>% mean(.)
}

My df looks like this:

df <- data.frame(name = c("A","B","C","D"),
                 member = c("Y","Y","Y","N"),
                 sales_jan = c("100","0","130","45"),
                 sales_feb = c("44","0","67","0"))

I can provide the flag as "Y"/"N" and want to provide "sales_jan"/"sales_feb"/.../"sales_dec" as var input.

markus
FinRC
  • Take a look at [Programming with dplyr](https://dplyr.tidyverse.org/articles/programming.html) – markus Mar 17 '21 at 08:49
  • Also a) you want "function", not "formula". Also, why are you formatting your sales as character, not as numeric? – deschen Mar 17 '21 at 08:53
  • And `pull()` creates a vector and `mean` gives a scalar, yet `df2` suggests you want a `data.frame`. And your formula [sic] has no return value... – Limey Mar 17 '21 at 08:55

1 Answer

You can write the function as:

library(dplyr)

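MeanFor <- function(df, flag, var){
   # Keep rows matching the flag, pull the column named by `var`,
   # convert the character values to numeric, then average them
   df %>% filter(member == flag) %>% pull(.data[[var]]) %>% as.numeric() %>% mean()
}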
df %>% MeanFor('Y', 'sales_jan')
#[1] 76.66667

df %>% MeanFor('Y', 'sales_feb')
#[1] 37

The function can also be written in base R:

MeanFor <- function(df, flag, var){
    mean(as.numeric(subset(df, member == flag)[[var]]))
}
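
As the comments point out, the sales columns are stored as character, which is why `as.numeric()` is needed above. Below is a minimal base R sketch of the same idea, assuming the columns are stored as numeric instead. The names `df_num` and `MeanForNumeric` are hypothetical and used only for illustration.

```r
# Hypothetical variant of the example data with numeric sales columns
df_num <- data.frame(
  name      = c("A", "B", "C", "D"),
  member    = c("Y", "Y", "Y", "N"),
  sales_jan = c(100, 0, 130, 45),
  sales_feb = c(44, 0, 67, 0)
)

# Hypothetical helper: mean of the column named by `var`
# over the rows where `member` equals `flag`
MeanForNumeric <- function(df, flag, var) {
  # Fail early if the requested column does not exist
  stopifnot(var %in% names(df))
  # [[var]] looks the column up by the string held in `var`
  mean(df[[var]][df$member == flag])
}

MeanForNumeric(df_num, "Y", "sales_jan")
MeanForNumeric(df_num, "Y", "sales_feb")

# Apply the helper to several month columns at once
sapply(c("sales_jan", "sales_feb"), function(v) MeanForNumeric(df_num, "Y", v))
```

Because the column is looked up with `[[var]]`, any column name can be passed as a string, and `sapply()` returns one mean per column, named after the column.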
Ronak Shah