0

I have the following dataframe

a        <- c(rep("CGR", 6), rep("AUS", 6), rep("ROW", 6) )
b        <- c("AUT", "CH", "ROW", "ROW", "ROW", "ROW", "AUT", "CH", "ROW", "ROW", "ROW", "ROW", "AUT", "CH", "ROW", "ROW", "ROW", "ROW" )
v        <- 1:18
category <- c("a", "b", "a", "a", "b", "b", "a", "b", "a", "a", "b", "b", "a", "b", "a", "a", "b", "b")

data.frame(a,b,v,category)


     a   b  v category
1  CGR AUT  1        a
2  CGR  CH  2        b
3  CGR ROW  3        a
4  CGR ROW  4        a
5  CGR ROW  5        b
6  CGR ROW  6        b
7  AUS AUT  7        a
8  AUS  CH  8        b
9  AUS ROW  9        a
10 AUS ROW 10        a
11 AUS ROW 11        b
12 AUS ROW 12        b
13 ROW AUT 13        a
14 ROW  CH 14        b
15 ROW ROW 15        a
16 ROW ROW 16        a
17 ROW ROW 17        b
18 ROW ROW 18        b

What I want to do is to create a new dataframe or collapse the existing one where I want to sum the values in 'v' when column vector 'b' is equal to `row' by the categories of the column vector 'category'

The expected result should look like this:

 aa  bb vv category
1  CGR AUT  1         a
2  CGR  CH  2         b
3  CGR ROW  7         a
4  CGR ROW 11         b
5  AUS AUT  7         a
6  AUS  CH  8         b
7  AUS ROW 19         a
8  AUS ROW 23         b
9  ROW AUT 13         a
10 ROW  CH 14         b
11 ROW ROW 31         a
12 ROW ROW 35         b

Can someone just something efficient to perform this task ?

msh855
  • 1,493
  • 1
  • 15
  • 36

1 Answers1

6
# example dataset
a        <- c(rep("CGR", 6), rep("AUS", 6), rep("ROW", 6) )
b        <- c("AUT", "CH", "ROW", "ROW", "ROW", "ROW", "AUT", "CH", "ROW", "ROW", "ROW", "ROW", "AUT", "CH", "ROW", "ROW", "ROW", "ROW" )
v        <- 1:18
category <- c("a", "b", "a", "a", "b", "b", "a", "b", "a", "a", "b", "b", "a", "b", "a", "a", "b", "b")

df1 = data.frame(a,b,v,category)

library(dplyr)

df1 %>%
  group_by(a,b,category) %>%
  summarise(vv = sum(v)) %>%
  ungroup()

# # A tibble: 12 x 4
#        a      b category    vv
#   <fctr> <fctr>   <fctr> <int>
# 1    AUS    AUT        a     7
# 2    AUS     CH        b     8
# 3    AUS    ROW        a    19
# 4    AUS    ROW        b    23
# 5    CGR    AUT        a     1
# 6    CGR     CH        b     2
# 7    CGR    ROW        a     7
# 8    CGR    ROW        b    11
# 9    ROW    AUT        a    13
# 10   ROW     CH        b    14
# 11   ROW    ROW        a    31
# 12   ROW    ROW        b    35
AntoniosK
  • 15,991
  • 2
  • 19
  • 32
  • 1
    thank you, "patrida"- it worked. One more thing that I forgot is what if there there are more values to summarize ? so say v and v1 ?? – msh855 Dec 12 '17 at 15:02
  • You can explicitly name them like `vv = sum(v), vv1 = sum(v1)`, or try and use `summarise_at`, or `summarise_each`. – AntoniosK Dec 12 '17 at 16:02