I have a dataframe that needs to be summarized by column B into one dataframe and, separately, by column A into another. For context, column B is a sub-level of column A in the hierarchy. I only need to aggregate columns C through E, so I decided that dplyr would be the most helpful.

A  |  B  |  C  |  D  |  E  |  F  |  G
--------------------------------------
1  |  1A |  3  |  4  |  5  |  3  |  2
1  |  1B |  4  |  4  |  4  |  4  |  3
2  |  2A |  2  |  2  |  2  |  2  |  2
...
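
For anyone who wants to reproduce this, the three rows shown above could be built roughly as follows. This is only a sketch: the rows elided by "..." are not included, and the column types (numeric for A and C:G, character for B) are my assumption.

```r
# Illustrative data: only the three rows shown in the table above.
# Column types are assumed; the elided rows are left out.
df <- data.frame(
  A = c(1, 1, 2),
  B = c("1A", "1B", "2A"),
  C = c(3, 4, 2),
  D = c(4, 4, 2),
  E = c(5, 4, 2),
  F = c(3, 4, 2),
  G = c(2, 3, 2)
)
```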

My team decided that wrapping this in a function would give us cleaner code. To summarize the dataframe by column A, I know I would write something like this:

df %>%
  select(A, C, D, E) %>%
  group_by(A) %>%
  summarise(C = sum(C), D = sum(D), E = sum(E))

and to summarize by column B, something like this:

df %>%
  select(B, C, D, E) %>%
  group_by(B) %>%
  summarise(C = sum(C), D = sum(D), E = sum(E))

I am struggling to translate this into a function that works for either scenario. Here is what I have so far:

slicedata <- function(df, column_name){
  df %>%
    select(column_name, C, D, E) %>%
    group_by(column_name) %>%
    summarise(C = sum(C), D = sum(D), E = sum(E))
}

But when I pass column B as an argument in that function, this is what I get:

slicedata(df, B)
Error in .f(.x[[i]], ...) : object 'B' not found 

In short, I am trying to write a function that aggregates the integer columns of this dataframe by whichever column I pass as an argument, and I do not understand why this error shows up.

AENick

1 Answer

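The function fails because column_name is evaluated as an ordinary variable: R tries to look up B in the calling environment, where no such object exists. We can use enquo() to capture the argument as a quosure and then unquote it with !! inside the dplyr verbs: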

slicedata <- function(df, column_name){
  # capture the bare column name as a quosure
  column_name <- enquo(column_name)
  df %>%
    select(!!column_name, C, D, E) %>%
    group_by(!!column_name) %>%
    summarise(C = sum(C), D = sum(D), E = sum(E))
}

slicedata(df, B)
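
As an alternative sketch, assuming rlang 0.4.0 or later and dplyr 1.0.0 or later are installed, the curly-curly operator {{ }} combines the enquo() and !! steps, and across() avoids repeating sum() for each column. The name slicedata2 and the sample data (only the three rows shown in the question) are illustrative, not part of the original code.

```r
library(dplyr)

# Illustrative data: only the three rows shown in the question
df <- data.frame(
  A = c(1, 1, 2),
  B = c("1A", "1B", "2A"),
  C = c(3, 4, 2), D = c(4, 4, 2), E = c(5, 4, 2),
  F = c(3, 4, 2), G = c(2, 3, 2)
)

# Hypothetical variant of slicedata using {{ }} and across()
slicedata2 <- function(df, column_name) {
  df %>%
    # {{ }} captures and unquotes the bare column name in one step
    group_by({{ column_name }}) %>%
    # sum C, D and E; ungrouped columns not named here are dropped
    summarise(across(c(C, D, E), sum))
}

by_A <- slicedata2(df, A)  # one row per value of A
by_B <- slicedata2(df, B)  # one row per value of B
```

The select() step is left out here because summarise() only keeps the grouping column and the columns it computes.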
akrun