I have a few years of data and I am trying to look at the duration of events within the dataset. For example, I would like to know the duration of "strong wind events". I can do this with:

wind.df <- data.frame(ws = c(6,7,8,9,1,7,6,1,2,3,4,10,4,1,2))
r <- rle(wind.df$ws>=6)
sequence <- unlist(sapply(r$lengths, seq))
wind.df$strong.wind.duration <- sequence
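
For reference, the same per-run counter can be sketched with base R's `sequence()`, which expands a vector of run lengths into 1..n counters. The column is computed into a variable called `run.position` here (a name chosen only for this illustration) so that it does not shadow the base function.

```r
wind.df <- data.frame(ws = c(6,7,8,9,1,7,6,1,2,3,4,10,4,1,2))

# Runs of consecutive values at or above / below the threshold of 6
r <- rle(wind.df$ws >= 6)

# sequence() turns the run lengths 4,1,2,4,1,3 into a counter that
# restarts at 1 at the start of every run
run.position <- sequence(r$lengths)
run.position
# 1 2 3 4 1 1 2 1 2 3 4 1 1 2 3
```

This counter restarts at every change between strong and weak wind, which is exactly the behaviour the question wants to relax for short dips.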

However, if the wind speed drops below the threshold for at most two data points, I want to keep counting. If the wind speed stays below the threshold for more than two data points, I want to reset the counter.

So the output would look like:

## manually creating a desired output ###
wind.df$desired.output <- c(1,2,3,4,5,6,7,1,2,3,4,5,6,7,8)
Ayushi Kachhara

1 Answer

You can do this with a custom function that loops over your wind speeds and keeps counting until the speed stays below the threshold for more than two consecutive values:

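numerate <- function(nv, threshold = 6) {
  n <- length(nv)
  clist <- integer(n)  # preallocate instead of growing the vector
  counter <- 1L
  low <- TRUE          # TRUE until a value at or above the threshold is seen
  for (i in seq_along(nv)) {
    # Current value plus up to two following values
    # (the window is truncated at the end of the vector)
    window <- nv[i:min(i + 2, n)]
    if (max(window, na.rm = TRUE) < threshold && !low) {  # reset the counter
      counter <- 1L
      low <- TRUE
    }
    if (nv[i] >= threshold) low <- FALSE
    clist[i] <- counter
    counter <- counter + 1L
  }
  clist
}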

wind.df <- data.frame(ws = c(6,7,8,9,1,7,6,1,2,3,4,10,4,1,2))
wind.df$desired.output = numerate(wind.df$ws)

The output of this function would be:

> print(wind.df)
   ws desired.output
1   6              1
2   7              2
3   8              3
4   9              4
5   1              5
6   7              6
7   6              7
8   1              1
9   2              2
10  3              3
11  4              4
12 10              5
13  4              1
14  1              2
15  2              3

The desired output in your question is incorrect: the last three wind speeds are 4, 1, 2. That is more than two consecutive values below 6 after a value of at least 6, so the counter has to be reset at that point.
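
The same reset rule can also be sketched without an explicit loop, using `rle()` to find below-threshold runs longer than two values and restarting the counter where each such run begins. This is only an illustrative alternative, not a drop-in replacement: the function name `count_with_gaps` and the `max_gap` argument are made up for this example, and it assumes `ws` contains no `NA` values.

```r
count_with_gaps <- function(ws, threshold = 6, max_gap = 2) {
  r <- rle(ws >= threshold)
  # Below-threshold runs that are too long to be bridged
  long_calm <- !r$values & r$lengths > max_gap
  # Index in ws at which each run starts
  starts <- cumsum(c(1, head(r$lengths, -1)))
  # The counter restarts at position 1 and at the start of every long calm run
  is_reset <- seq_along(ws) %in% c(1, starts[long_calm])
  segment <- cumsum(is_reset)
  # Count 1, 2, 3, ... within each segment
  ave(seq_along(ws), segment, FUN = seq_along)
}

ws <- c(6,7,8,9,1,7,6,1,2,3,4,10,4,1,2)
count_with_gaps(ws)
# 1 2 3 4 5 6 7 1 2 3 4 5 1 2 3
```

For the example data this gives the same result as the loop. One difference to be aware of: a dip of only one or two values at the very end of the series does not trigger a reset here, whereas the loop resets in that case because its look-ahead window runs past the end of the data.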

Martin Wettstein