1

I am using the EPA NLA dataset to find the average temperature in the epiliminion for some lake data. The data set looks like this:

SITE  DEPTH   METALIMNION   TEMP FIELD
1      0.0       NA            25.6
1      0.5       NA            25.1
1      0.8       T             24.9
1      1.0       NA            24.1  
1      2.0       B             23.0
2      0.0       NA            29.0
2      0.5       T             28.0

"T" indicates the end of the epiliminion, and I want to average all corresponding temperature values including and above the "T" for each site. I have no idea where to even begin. (The "B" is irrelevant for this issue). Thanks!

atosbar
  • 35
  • 6

2 Answers2

0

With base R you can do it like this.

I use ave twice, the first time to determine where column METALIMNION has a "T", by group of SITE. This is vector g.

The second, average METALIMNION by SITE and that vector g.

g <- with(NLA, ave(as.character(METALIMNION), SITE, 
              FUN = function(x) {
                x[is.na(x)] <- ""
                rev(cumsum(rev(x) == "T"))
                }))
NLA$AVG <- ave(NLA$TEMP.FIELD, NLA$SITE, g)

NLA
#  SITE DEPTH METALIMNION TEMP.FIELD   AVG
#1    1   0.0        <NA>       25.6 25.20
#2    1   0.5        <NA>       25.1 25.20
#3    1   0.8           T       24.9 25.20
#4    1   1.0        <NA>       24.1 23.55
#5    1   2.0           B       23.0 23.55
#6    2   0.0        <NA>       29.0 28.50
#7    2   0.5           T       28.0 28.50
Rui Barradas
  • 70,273
  • 8
  • 34
  • 66
0

Assuming that there is only one 'T' for each value of site, using dplyr package:

library(dplyr)

data.frame(SITE=c(1,1,1,1,1,2,2),TEMP=c(25.6,25.1,24.9,24.1,23.0,29.0,28.0)) %>%
    group_by(SITE) %>%
    summarise(meanTemp=mean(TEMP))

Result:

# A tibble: 2 x 2
   SITE meanTemp
  <dbl>    <dbl>
1     1     24.5
2     2     28.5
Richard J. Acton
  • 885
  • 4
  • 17