
Here is an example data frame

ID    Var1    Var2    Var3 ....... Var85
A      3       2        1            3
B      1       3        1            2
A      2       1        1            1
A      1       2        2            1
C      3       1        3            2
C      2       1        2            1
B      1       3        3            1

I want to create the following, which sums the rows for each ID:

ID    Var1    Var2    Var3 ....... Var85
A      6       5        4            5
B      2       6        4            3
C      5       2        5            3

I found a solution for a single variable using dplyr, but I don't know how to apply it to multiple columns:

df <- df %>% group_by(ID) %>% summarise(Var1 = sum(Var1)) %>% as.data.frame()

I thought of doing this with a loop over the columns, but I am hoping for a much simpler solution.

Marble
  • Use `across` inside `summarise`: `df %>% group_by(ID) %>% summarise(across(Var1:Var85, ~ sum(.x)))` – PaulS Jun 16 '22 at 20:18
  • This was an example data frame; my actual data frame doesn't have such tidy column names. – Marble Jun 16 '22 at 20:43
  • I could use your solution by saving the column names to a vector, renaming the columns as above, applying the solution, and then restoring the original column names. – Marble Jun 16 '22 at 20:44
  • Well, without seeing a piece of your dataset, it is hard to offer a better solution. – PaulS Jun 16 '22 at 20:51
  • The dataset is not much different, instead of the column names as c(Var1:Var85), they look like c(alpha, beta, gamma, delta, .........., omega ) – Marble Jun 16 '22 at 21:09
  • If these columns of yours are sequential, then use: `df %>% group_by(ID) %>% summarise(across(alpha:omega, ~ sum(.x)))`. – PaulS Jun 16 '22 at 21:14
  • This isn't working; I get the following error: – Marble Jun 16 '22 at 21:22
  • Error in `dplyr::summarise()`: ! Problem while computing `..1 = across(everything() ~ sum(.x))`. ℹ The error occurred in group 0: character(0). Caused by error in `across()`: ! Must supply a column selection. ℹ You most likely meant: `across(everything(), everything() ~ sum(.x))`. ℹ The first argument `.cols` selects a set of columns. ℹ The second argument `.fns` operates on each selected columns. Run `rlang::last_error()` to see where the error occurred. – Marble Jun 16 '22 at 21:22
  • Post the `colnames` of your dataframe. – PaulS Jun 16 '22 at 21:28
  • "ID", "X1.16", "X1.16.1", "X1.16.2", "X1.17", "X1.17.1", "X1.17.2", "X1.17.3", "X1.18" ........ and so on – Marble Jun 16 '22 at 21:34
  • This works: `df <- data.frame( ID = c("A", "B", "A", "A", "C", "C", "B"), X1.16 = c(3L, 1L, 2L, 1L, 3L, 2L, 1L), X1.16.1 = c(2L, 3L, 1L, 2L, 1L, 1L, 3L), X1.16.2 = c(1L, 1L, 1L, 2L, 3L, 2L, 3L), X1.17 = c(3L, 2L, 1L, 1L, 2L, 1L, 1L) ); df %>% group_by(ID) %>% summarise(across(X1.16:X1.17, ~ sum(.x)))` – PaulS Jun 16 '22 at 21:38
  • Alternatively, you can use `across(2:5` instead of `across(X1.16:X1.17`. – PaulS Jun 16 '22 at 21:40
  • I see that the columns of my data are factors, whereas your dataset is numeric. Is that why it is problematic? – Marble Jun 16 '22 at 21:43
  • So, do this: `df %>% group_by(ID) %>% summarise(across(X1.16:X1.17, ~ sum(.x %>% as.numeric)))`. `as.numeric` will convert your factors to numbers before summing. – PaulS Jun 16 '22 at 21:47
  • you meant only df %>% group_by(ID) %>% summarise(across(X1.16:X1.17, ~ sum(.x %>% as.numeric))),......which works perfectly – Marble Jun 16 '22 at 21:56
  • Thanks a lot for helping out; I should have cross-checked the column classes in my data frame. @PaulS – Marble Jun 16 '22 at 21:57
  • Excellent, Maharnab Naha! – PaulS Jun 16 '22 at 22:02
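
The approach worked out in the comments can be sketched as a single runnable script. This is only a minimal sketch, assuming dplyr 1.0.0 or later (for `across`); the sample data frame and the helper name `factor_to_number` are hypothetical and not taken from the asker's real data. One caveat worth checking on real data: calling `as.numeric` directly on a factor returns its internal level codes, which match the labels only when the levels happen to be 1, 2, 3, ..., so the sketch converts through `as.character` first.

```r
library(dplyr)

# Hypothetical data shaped like the question, with factor value columns
# as in the asker's real data frame.
df <- data.frame(
  ID      = c("A", "B", "A", "A", "C", "C", "B"),
  X1.16   = factor(c(3, 1, 2, 1, 3, 2, 1)),
  X1.16.1 = factor(c(2, 3, 1, 2, 1, 1, 3)),
  X1.17   = factor(c(10, 20, 10, 10, 20, 10, 20))
)

# Hypothetical helper: convert a factor through its labels,
# not through its internal integer level codes.
factor_to_number <- function(x) as.numeric(as.character(x))

# everything() skips the grouping column, so no column names are needed.
result <- df %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ sum(factor_to_number(.x)))) %>%
  as.data.frame()

print(result)
```

Because `across(everything(), ...)` never selects the grouping column, this form does not depend on the column names being sequential or tidy. If the columns are already numeric, `~ sum(.x)` alone should be enough.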

0 Answers