I am trying to find the dates each year on which multi-condition thresholds are exceeded in time series data. I used the rollapply function to identify the first date each year (let's call this date A) when the daily mean temperature (Mean_Temp) is at least B (B = 5.0) for C consecutive days (C = 5; width).

Sample Data:

    DOY Year    Month   Day Mean_Temp
    96  1960    4       5   1.5
    97  1960    4       6   -1
    98  1960    4       7   -1.9
    99  1960    4       8   -2.3
    100 1960    4       9   1.3
    101 1960    4       10  -0.5
   102  1960    4       11  5.9
   103  1960    4       12  5.7
   104  1960    4       13  5.3
   105  1960    4       14  6.1
   106  1960    4       15  9.9

Sample Code:

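library(dplyr)
library(zoo)

# First day each year that ends a run of 5 consecutive days with Mean_Temp >= 5.0.
# Note: Mean_Temp is overwritten with the 5-day rolling minimum.
Table <- data %>%
  group_by(Year) %>%
  mutate(Mean_Temp = rollapply(Mean_Temp, width = 5, FUN = min, align = "right", fill = NA, na.rm = TRUE)) %>%
  filter(Mean_Temp >= 5.0) %>%
  slice(1)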

Sample Output:

        X       Year    Month   Day Mean_Temp
    1   106     1960    4       15  5.3
    2   466     1961    4       10  5.6
    3   830     1962    4       9   5.6
    4   1205    1963    4       19  5.6
    5   1561    1964    4       9   5.6
    6   1948    1965    5       1   7.8
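One thing worth noting about this output: because `mutate()` overwrites `Mean_Temp`, the value reported for 1960-04-15 (5.3) is the 5-day rolling minimum, not that day's temperature (9.9). Below is a minimal base R sketch that keeps both values, using only the sample rows shown earlier. `roll_min_right` and `Min_5day` are hypothetical names, and the helper is a simple stand-in for `rollapply`, not a reproduction of it. It handles a single year only.

```r
# Sample rows from the question (DOY 96-106 of 1960)
temps <- data.frame(
  DOY = 96:106,
  Year = 1960L,
  Month = 4L,
  Day = 5:15,
  Mean_Temp = c(1.5, -1, -1.9, -2.3, 1.3, -0.5, 5.9, 5.7, 5.3, 6.1, 9.9)
)

# Hypothetical helper: right-aligned rolling minimum, NA until a full window exists
roll_min_right <- function(x, width) {
  vapply(seq_along(x), function(i) {
    if (i < width) NA_real_ else min(x[(i - width + 1):i])
  }, numeric(1))
}

# Store the rolling minimum in its own column so Mean_Temp keeps the daily value
temps$Min_5day <- roll_min_right(temps$Mean_Temp, 5)

# First row whose 5-day window is entirely at or above the threshold
first_warm <- temps[which(temps$Min_5day >= 5.0)[1], ]
first_warm
```

For multi-year data the same steps would need to be applied per year, for example inside the existing `group_by(Year)` pipeline.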

    

However, I would now like to find any instances (if they occur) of temperatures below a new threshold, X, for Y days, within Z days of A (e.g. 1960-04-15). For example, when does the temp drop below -15 within 28 days after the dates above?

The kind of output I am looking for would be something like:

    Year    Month   Day Mean_Temp
    1960    4       16  -16.1
    1960    4       19  -17.2
    1961    4       14  -15.2
    1961    4       15  -15.1
    1963    4       30  -16.7
    1963    5       1   -17.1
    1964    4       16  -15.3
    1964    4       17  -16.3

I am wondering about using the output from my rollapply function as the starting date each year (A), then monitoring the temperature for the next Z days to see whether it drops below X. However, I am a little lost as to how to code that type of function: essentially looping through the daily temperature data each year after a given date (presumably referenced from a separate table) and watching for a certain temperature threshold.
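The idea described above could be sketched roughly as follows in base R: attach each year's start date A to the daily rows, count the days elapsed since A, and keep the rows inside the window that fall below the cold threshold. The daily values, the `starts` table, and names such as `days_after` are made up for illustration. The sketch flags individual cold days only; it does not enforce a run of several consecutive days.

```r
# Made-up daily data for part of one year (illustration only, not real observations)
daily <- data.frame(
  Year = 1960L,
  Month = 4L,
  Day = 10:21,
  Mean_Temp = c(5.9, 5.7, 5.3, 6.1, 9.9, 2.0, -16.1, -3.0, 1.0, -17.2, 4.0, 6.0)
)
daily$Date <- as.Date(ISOdate(daily$Year, daily$Month, daily$Day))

# Hypothetical table of start dates (date A), one row per year
starts <- data.frame(Year = 1960L, Start = as.Date("1960-04-14"))

cold_threshold <- -15  # temperature threshold to watch for
window_days <- 28      # number of days after A to monitor

# Attach each year's start date to every daily row of that year
joined <- merge(daily, starts, by = "Year")
joined <- joined[order(joined$Date), ]

# Whole days elapsed since the start date
joined$days_after <- as.integer(joined$Date - joined$Start)

# Keep days strictly after A, inside the window, and below the threshold
cold_after_start <- joined[
  joined$days_after > 0 &
    joined$days_after <= window_days &
    joined$Mean_Temp < cold_threshold,
  c("Year", "Month", "Day", "Mean_Temp")
]
cold_after_start
```

Because the join is keyed on `Year`, the same code should work unchanged when `daily` and `starts` cover several years.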

Here is a sample of the data.

structure(list(X = 1:20, Year = c(1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L), Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Day = 1:20, Mean_Temp = c(-12.2, -10, -2.3, -4.2, -7.2, -12.3, -6.1, -5, -12.5, -9.2, -9.2, -6.7, -4.2, -6.1, -4.7, -6.7, -6.1, -6.7, -8.1, -7.8)), row.names = c(NA, 20L), class = "data.frame")
MacOS
b.coleman
  • Welcome to Stack Overflow! Could you please run `dput` on your data frame and paste the output here? Thank you! – MacOS Nov 27 '20 at 20:53
  • Hi and thank you! Apologies for omitting this portion of the post: structure(list(X = 1:20, Year = c(1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L), Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Day = 1:20, Mean_Temp = c(-12.2, -10, -2.3, -4.2, -7.2, -12.3, -6.1, -5, -12.5, -9.2, -9.2, -6.7, -4.2, -6.1, -4.7, -6.7, -6.1, -6.7, -8.1, -7.8)), row.names = c(NA, 20L), class = "data.frame") – b.coleman Nov 27 '20 at 21:43
  • Thank you for posting the output! However, can you please give a more detailed example of what you are looking for? I posted a preliminary solution. Please have a look. – MacOS Nov 28 '20 at 10:40

1 Answer

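I do not know exactly what you are looking for, but the following may be a step in the right direction. It flags each day below the threshold and counts how many consecutive days the temperature has stayed below it.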

MEAN.TEMPERATURE.THRESHOLD <- -5.0

df <- structure(list(X = 1:20, Year = c(1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L, 1960L), Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Day = 1:20, Mean_Temp = c(-12.2, -10, -2.3, -4.2, -7.2, -12.3, -6.1, -5, -12.5, -9.2, -9.2, -6.7, -4.2, -6.1, -4.7, -6.7, -6.1, -6.7, -8.1, -7.8)), row.names = c(NA, 20L), class = "data.frame")

df <- subset(df, select = -X)


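# Flag days whose mean temperature is below the threshold
df$belowThreshold <- df$Mean_Temp < MEAN.TEMPERATURE.THRESHOLD

# Length of the current run of consecutive below-threshold days (0 on any other day)
df$cumSumBelowThreshold <- with(df,
                                ave(as.integer(belowThreshold),
                                    cumsum(!belowThreshold),
                                    FUN = cumsum))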

df
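If what matters is a run of several consecutive days below the threshold, the same flag can also be summarised with `rle()`. Here is a minimal sketch, assuming a hypothetical minimum run length `min_run` of 3 days. Since `Day` runs from 1 to 20 in the sample, the vector index doubles as the day of January 1960. `cold_spells` and its column names are illustrative only.

```r
# Mean_Temp values from the sample data frame above (January 1-20, 1960)
mean_temp <- c(-12.2, -10, -2.3, -4.2, -7.2, -12.3, -6.1, -5, -12.5, -9.2,
               -9.2, -6.7, -4.2, -6.1, -4.7, -6.7, -6.1, -6.7, -8.1, -7.8)

threshold <- -5.0  # same threshold as above
min_run <- 3       # assumed minimum number of consecutive days (illustration only)

# TRUE on days strictly below the threshold
below <- mean_temp < threshold

# Run-length encode the flag, then recover each run's first and last index
runs <- rle(below)
run_end <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

# Keep only below-threshold runs that last at least min_run days
keep <- runs$values & runs$lengths >= min_run
cold_spells <- data.frame(
  start_day = run_start[keep],
  end_day = run_end[keep],
  n_days = runs$lengths[keep]
)
cold_spells
```

Each row of `cold_spells` describes one qualifying cold spell, which may be easier to filter against a start date than the per-day counter.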
MacOS
  • I'm not sure I was clear enough in my example. My code (above) iterates through the yearly temperature data to find the first instance of 5 consecutive days with a mean temperature >= 5 °C: structure(list(X = c(115L, 496L, 850L, 1205L, 1581L, 1948L, 2302L, 2693L, 3013L, 3395L), Year = 1960:1969, Month = c(4L, 5L, 4L, 4L, 4L, 5L, 4L, 5L, 3L, 4L), Day = c(24L, 10L, 29L, 19L, 29L, 1L, 20L, 16L, 31L, 17L), Mean_Temp = c(7.5, 5, 15, 5, 5.3, 5.9, 5, 5, 5.6, 8.6)), row.names = c(NA, 10L), class = "data.frame") I am hoping to find instances of the mean temperature < -5 °C within 28 days of those dates in my df. – b.coleman Nov 28 '20 at 16:41