1

I am trying perform dplyr summarize iteratively using concatenated string as column names

Category=c("a","a","b","b","b","c","c","c")
A1=c(1,2,3,4,3,2,1,2)
A2=c(10,11,12,13,14,15,16,17)
tt=cbind(Category,A1,A2)
tdat=data.frame(tt)
colnames(tdat)=c("Category","M1","M2")
ll=matrix(1:2,nrow=2)
for(i in 1:nrow(ll)) {
  Aone=tdat %>% group_by(Category) %>%
    summarize(Msum=sum(paste("M",i,sep="")))
}

I end up the following error

x invalid 'type' (character) of argument
ℹ Input Msum is sum(paste("M", i, sep = "")).
ℹ The error occurred in group 1: Category = "A".
Run rlang::last_error() to see where the error occurred.```


The goal is to iteratively get arithmentic functions within summarize function in dplyr. But this concatenated string is not recognized as column name. 
Vignesh
  • 912
  • 6
  • 13
  • 24
  • 1
    Your code returns an error `Error in cbind(Category, A1, A2) : object 'Category' not found`. Also there is no `A1`, `A2` defined when you do `cbind(Category,A1,A2)`. Could you correct the code so that your post is reproducible ? It would also be helpful if you show expected output for the data shared. – Ronak Shah Dec 19 '20 at 08:29
  • What is the `Category` column. – akrun Dec 24 '20 at 17:21
  • Just updated the code – Vignesh Dec 24 '20 at 17:23
  • I updated the post. It is working for me – akrun Dec 24 '20 at 17:29

1 Answers1

2

If we want to pass a string as column name, then convert to symbol and evaluate (!!)

library(dplyr)
Aone <- vector('list', nrow(ll))
for(i in seq_len(nrow(ll))) {
      Aone[[i]] <- tdat %>%
                    group_by(Category) %>%
                    summarize(Msum = sum(!! rlang::sym(paste("M", i, sep=""))))
    }

Or assuming the column name is 'M-1', 'M-2', etc, it should work as well

Aone <- vector('list', 2)
for(i in seq_along(Aone)) {
   Aone[[i]] <- tdat %>%
        group_by(Category) %>% 
       summarise(Msum = sum(!! rlang::sym(paste("M-", i, sep=""))), 
         .groups = 'drop')
  }

NOTE: The ll was not clear in the original post. Here, we create a list with length equal to the number of 'M-' columns and assign the output back to the list element by looping over the sequence of that list

data

tdat <- data.frame(Category, M1, M2)


tdat <- structure(list(Category = c("A", "A", "A", "A", "B", "B", "B", 
"B"), `M-1` = c(1, 2, 3, 4, 3, 2, 1, 2), `M-2` = c(10, 11, 12, 
13, 14, 15, 16, 17)), class = "data.frame", row.names = c(NA, 
-8L))
akrun
  • 874,273
  • 37
  • 540
  • 662