I have a dataset that includes many "transects", and a multiple "transects" comprise a "plane" (e.g. Plane P1 = Transect T1 + Transect T2)
The current data structure (see example below) has the length of each transect repeated in the column "transect length" for each sample that was collected in the transect. I need to sum the transect lengths to get the "plane length" for EACH plane.
Here is what the data structure I've been given looks like:
Plane name | Transect name | Transect length | Sample |
---|---|---|---|
P1 | T1 | 1002 | x |
P1 | T1 | 1002 | y |
P1 | T2 | 1034 | z |
P1 | T2 | 1034 | a |
P1 | T2 | 1034 | b |
P2 | T3 | 1222 | c |
P2 | T3 | 1222 | d |
P2 | T4 | 1324 | e |
I added an "index" column, thinking there must be some way I can add the transect lengths when the index for that row = 1.
data3 <- data2 %>%
group_by(transect)%>%
mutate(index = seq(n())) %>%
ungroup() %>%
Table showing the added index column:
Plane name | Transect name | Transect length | Sample | Index |
---|---|---|---|---|
P1 | T1 | 1002 | x | 1 |
P1 | T1 | 1002 | y | 2 |
P1 | T2 | 1034 | z | 1 |
P1 | T2 | 1034 | a | 2 |
P1 | T2 | 1034 | b | 3 |
P2 | T3 | 1222 | c | 1 |
P2 | T3 | 1222 | d | 2 |
P2 | T4 | 1324 | e | 1 |
But now I am stuck on how to add only the transect lengths with index = 1 for a certain plane, and iterate over all the planes...
So for example, in the end, what I need it to look like is this:
Plane name | Transect name | Transect length | Sample | Index | Plane length |
---|---|---|---|---|---|
P1 | T1 | 1002 | x | 1 | 2036 |
P1 | T1 | 1002 | y | 2 | 2036 |
P1 | T2 | 1034 | z | 1 | 2036 |
P1 | T2 | 1034 | a | 2 | 2036 |
P1 | T2 | 1034 | b | 3 | 2036 |
P2 | T3 | 1222 | c | 1 | 2546 |
P2 | T3 | 1222 | d | 2 | 2546 |
P2 | T4 | 1324 | e | 1 | 2546 |
Notice the plane length for plane P1 is the sum of only the T1 transect lengths that have index = 1. So: 1002 + 1034 = 2036, NOT 1002 + 1002 + 1034 + 1034 + 1034 = 5106.
I tried several variations of code using mutate, sum, summarize, etc. with no luck:
plane_lengths <- data3 %>%
mutate(plane_length = sum(transect_length[index ==1]))
^ this one seems to do nothing. :(
plane_lengths <- data3 %>% group_by(plane_name) %>% summarize(plane_length = sum(transect_length))
^ this one just adds every value of transect length (all index values, not just index = 1)
Is there some way to modify the code above to take only values for transect length where the index = 1?
I am still a bit new to R, and I really appreciate any help I can get!!! Thank you.