I have a data frame with several parameter columns. I'd like to merge groups of rows, combining some columns one way and the rest another way. Here is my sample data, ZZ:

ZZ <- data.frame(Name = c("A", "B", "C", "D", "E", "F"),
                 A1 = c(19, 20, 21, 23, 45, 67),
                 A2 = c(1, 2, 3, 4, 5, 6),
                 A3 = c(7, 8, 13, 24, 88, 90),
                 x = c(4, 5, 6, 8, 23, 16),
                 y = c(-3, -7, -6, -9, 3, 2))

> ZZ
  Name A1 A2 A3  x  y
1    A 19  1  7  4 -3
2    B 20  2  8  5 -7
3    C 21  3 13  6 -6
4    D 23  4 24  8 -9
5    E 45  5 88 23  3
6    F 67  6 90 16  2

I want to aggregate rows A, B, C and rows D, E, F so that each group gets a new name (e.g. C1 and C2). A1, A2 and A3 should be combined by sum, while x and y should be combined by mean.

How can this be done please? The result should be:

> ZZ2
  Name  A1 A2  A3      x      y
1   C1  60  6  28  5.000 -5.333
2   C2 135 15 202 15.667 -1.333
Joke O.

1 Answer

Based on how I interpreted your question, I believe this should give you what you want, using dplyr:

library(dplyr)

result <- ZZ %>% 
  mutate(Name = ifelse(Name %in% c("A", "B", "C"), "C1", "C2")) %>% 
  group_by(Name) %>% 
  summarise(A1 = sum(A1), A2 = sum(A2), A3 = sum(A3), x = mean(x), y = mean(y)) %>% 
  ungroup()

Depending on how many rows with different names you have, there might be better alternatives for recoding the Name variable into the two groups.
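
One such alternative, as a minimal sketch: keep the mapping in a named character vector and index it by Name, so that adding a group only means extending the vector. The vector name `groups` is made up for illustration, and `across()` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

ZZ <- data.frame(Name = c("A", "B", "C", "D", "E", "F"),
                 A1 = c(19, 20, 21, 23, 45, 67),
                 A2 = c(1, 2, 3, 4, 5, 6),
                 A3 = c(7, 8, 13, 24, 88, 90),
                 x = c(4, 5, 6, 8, 23, 16),
                 y = c(-3, -7, -6, -9, 3, 2))

# Hypothetical lookup table: original name -> new group label
groups <- c(A = "C1", B = "C1", C = "C1", D = "C2", E = "C2", F = "C2")

result <- ZZ %>%
  # as.character() guards against Name being a factor (the default before R 4.0)
  mutate(Name = unname(groups[as.character(Name)])) %>%
  group_by(Name) %>%
  # sum the A columns, average x and y
  summarise(across(c(A1, A2, A3), sum), across(c(x, y), mean))
```

Any name missing from the vector comes out as NA and ends up in its own NA group, so it is worth checking that the vector covers every name in the data.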

EDIT: An example with four groups, using case_when:

result <- ZZ %>%
  mutate(Name = case_when(Name %in% c("A", "B", "C") ~ "C1",
                          Name %in% c("D", "E") ~ "C2",
                          Name %in% c("F", "G") ~ "C3",
                          Name %in% c("H", "I") ~ "C4")) %>%
  group_by(Name) %>%
  summarise(A1 = sum(A1), A2 = sum(A2), A3 = sum(A3), x = mean(x), y = mean(y)) %>%
  ungroup()
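
Note that case_when returns NA for any name that matches none of the conditions. Below is a sketch of a catch-all branch, assuming you would rather collect such rows under one label. The extra row "Z", the data frame name `ZZ_extra` and the label "Other" are invented for illustration.

```r
library(dplyr)

ZZ <- data.frame(Name = c("A", "B", "C", "D", "E", "F"),
                 A1 = c(19, 20, 21, 23, 45, 67),
                 A2 = c(1, 2, 3, 4, 5, 6),
                 A3 = c(7, 8, 13, 24, 88, 90),
                 x = c(4, 5, 6, 8, 23, 16),
                 y = c(-3, -7, -6, -9, 3, 2),
                 stringsAsFactors = FALSE)

# Hypothetical extra row whose name is not covered by any condition
ZZ_extra <- rbind(ZZ, data.frame(Name = "Z", A1 = 1, A2 = 1, A3 = 1,
                                 x = 1, y = 1, stringsAsFactors = FALSE))

result <- ZZ_extra %>%
  mutate(Name = case_when(Name %in% c("A", "B", "C") ~ "C1",
                          Name %in% c("D", "E") ~ "C2",
                          Name %in% c("F", "G") ~ "C3",
                          Name %in% c("H", "I") ~ "C4",
                          TRUE ~ "Other")) %>%  # fallback for unmatched names
  group_by(Name) %>%
  summarise(A1 = sum(A1), A2 = sum(A2), A3 = sum(A3), x = mean(x), y = mean(y))
```

Conditions are tested in order and the first match wins, so the `TRUE ~ "Other"` branch has to come last.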
Amanda
  • @JokeO. that matches the output of the code above, is there something you found wrong in it? If your concern is that it's a tibble you can do `as.data.frame(result)` – Amanda Jun 20 '18 at 17:29
  • Hi Amanda. No. I'll try it and respond. Someone asked for a sample output. – Joke O. Jun 20 '18 at 18:58
  • This worked. If I had a bigger data-frame and wanted a result with 4 different names grouping different number of rows per name, how would this be done? – Joke O. Jun 20 '18 at 19:04
  • @JokeO. you can use a case_when, I'll update the answer to show an example of this. – Amanda Jun 20 '18 at 19:31