
I am trying to sum values from the same column across multiple rows in R, but each input row contributes to one or two of the desired output rows, so I have struggled to use ddply or tapply successfully.

I have triangular transect data, where sample points were taken at each vertex (points 1, 3 and 5) and half-way along each edge (points 2, 4 and 6). I am trying to summarise the data collected along each side of the triangle: i.e. leg A is the sum of points 1 + 2 + 3; leg B is the sum of points 3 + 4 + 5; leg C is the sum of points 5 + 6 + 1.

My data is in the form:

Transect <- c(rep("T001",6),rep("T002",6),rep("T003",6))
Point <- rep(seq(1,6,1),3)
Area <- c(rep(3000, 8), 2500, 2000, rep(3000,4), 1000, rep(3000,3))
df <- data.frame(Transect, Point, Area)

The desired output would be:

Transect2 <- c(rep("T001",3),rep("T002",3),rep("T003",3))
Leg <- rep(c("A", "B", "C"),3)
Total.Area <- c(rep(9000,3), 8500, 7500, 9000, 7000, 7000, 9000)
df.out <- data.frame(Transect2, Leg, Total.Area)

Thanks in advance for your help, and apologies if the question title is poorly worded; I'm not sure how to describe this problem accurately!

Andrew

1 Answer


Using dplyr and reshape2:

library(dplyr)
library(reshape2)

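# Sum Area over the three points on each leg; vertex points 1, 3 and 5 count towards two legs
df %>% group_by(Transect) %>%
       summarise(A = sum(Area[Point %in% c(1, 2, 3)]),
                 B = sum(Area[Point %in% c(3, 4, 5)]),
                 C = sum(Area[Point %in% c(5, 6, 1)])) %>%
       # Reshape to long format: one row per Transect/Leg combination
       melt(id.vars = "Transect", variable.name = "Leg", value.name = "Total.Area")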
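
For comparison, the same idea can be sketched in base R with a lookup table instead of one `summarise` term per leg. This is only an illustrative alternative, not part of the original answer: the `legs` table and the `expanded` and `out` names are hypothetical, and the data is the example from the question.

```r
# Example data from the question
df <- data.frame(
  Transect = rep(c("T001", "T002", "T003"), each = 6),
  Point    = rep(1:6, times = 3),
  Area     = c(rep(3000, 8), 2500, 2000, rep(3000, 4), 1000, rep(3000, 3))
)

# Hypothetical lookup table: each leg lists the three points on it,
# so vertex points 1, 3 and 5 each appear under two legs
legs <- data.frame(
  Leg   = rep(c("A", "B", "C"), each = 3),
  Point = c(1L, 2L, 3L,  3L, 4L, 5L,  5L, 6L, 1L)
)

# Joining on Point duplicates each vertex row, once per leg it belongs to
expanded <- merge(df, legs, by = "Point")

# Sum Area within each Transect/Leg combination
out <- aggregate(Area ~ Transect + Leg, data = expanded, FUN = sum)

# Tidy up: order rows and rename the summed column to match the desired output
out <- out[order(out$Transect, out$Leg), ]
names(out)[names(out) == "Area"] <- "Total.Area"
rownames(out) <- NULL
out
```

Keeping the point-to-leg mapping in a table means a different transect shape only needs a different `legs` table; the merge and aggregate steps stay the same.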
jeremycg
  • This works beautifully, and I've managed to edit it and apply to a range of similar problems. The `%>%` notation was also new to me, and very useful. Thanks for your help! – Andrew Jul 06 '15 at 18:17
    no problem, check out [this cheatsheet](http://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf) for more fancy data wrangling – jeremycg Jul 06 '15 at 18:20