Hi, I have a question that I hope some of you can help me with.

My vector holds temperatures collected hourly:

a<-c(7.95, 7.8, 7.85, 7.6, 7.1, 5.55, 4.35, 4.1, 7.35, 10.7, 14.2, 
17.25, 19.1, 19.8, 20.1, 20.15, 19.9, 18.95, 16.7, 14.4, 13.75, 
12.1, 12.3, 11.4, 10.3, 8.55, 7.45, 7.05, 5.6, 5.95, 4.85, 5.3, 
9.35, 12.7, 16.15, 19.1, 20.5, 21.05, 21.4, 21.4, 21.35, 20.1, 
16.95, 15.8, 15.6, 14.95, 14.15, 13.85)

I want to count the events in which `a` goes both below 10 and above 20, each for a certain period of time.

Pictorially, this is what I am looking for. Here there are two events (blue and green) where the temperature amplitude threshold was reached. The result should be 2.

[Figure: plot of a, with the two events marked in blue and green]

============

Another example:

Here the temperature was below 10 and above 20 for at least 1 hour twice (that is, two events).

The result should be 2.

[Figure: plot of example 2]

Data for example 2:

b<-c(20.2, 20.55, 20.85, 21.7, 20.7, 18.7, 17.5, 17.4, 16.65, 17.15, 
15.8, 13.85, 12.55, 11.45, 10.2, 9.3, 8.2, 7.4, 7.25, 6.65, 5.9, 
4.75, 4.5, 4.15, 4.4, 6.25, 8.1, 10.35, 12.4, 14.3, 15.3, 16.3, 
17.25, 17.25, 16.85, 14.45, 12.85, 11.35, 10.2, 9.1, 8.6, 7.35, 
5.9, 4.85, 3.65, 3.3, 2.95, 2.65, 2.45, 4.85, 6.45, 8.25, 9.95, 
11.1, 12.3, 13.2, 13.95, 14.05, 13.15, 10.35, 8.15, 6.6, 6.3, 
6, 7.55, 5.85, 5.05, 4.75, 4.5, 4.75, 4.75, 4.55, 5.15, 8.45, 
12.05, 16.35, 18.9, 20.55, 21.6, 21.45, 21.75, 21.15, 20.05, 
17.75, 16.5, 18.2, 18.05, 17.95, 17.8, 17.55, 17.25, 16.95, 16.6, 
16.35, 16.1, 16.25, 16.4, 17.1, 17.8)
Garn_R
  • What is the meaning of instance? I see 3 times a is out of boundaries in the plot (a<=10 & a>=20.) – RobertoT Jun 02 '22 at 15:14
  • Thanks for the feedback. Instance in this case means an event. For example, in this dataset temperature was below 10 and above 20 for at least 1 hour only *one time*. If we decided to see amplitude for 2 hours, we would have 0 events. Is that better? Sorry if this is confusing... – Garn_R Jun 02 '22 at 15:21
  • So, when you say only 1 instance in your plot, you mean an instance that lasts more than 1 hour: `sum (z$values[z$lengths>1])` ? – RobertoT Jun 02 '22 at 15:26
  • I uploaded my plot with another data – Garn_R Jun 02 '22 at 15:42
  • Updated my answer. I finally got what you were asking – RobertoT Jun 02 '22 at 17:09
  • I think you are on the right track until "ranges". For the data set I just uploaded, that formula did not yield the correct result. I got 3, it was supposed to be 2. Perhaps your range should always contain 1,0,2,0 or 0,1,0,2 or 0,2,0,1 or 2,0,1,0 (in this order) to be true? – Garn_R Jun 02 '22 at 21:08
  • I don't think adding another 0 to the pattern will help, because then the code will fail when the last group is out of range but still true. There must be a way to check if the previous group `lag(ranges)` has been used to return `TRUE` in the previous pattern and therefore it doesn't matter if it is 1 or 2. But right now I don't see how. – RobertoT Jun 03 '22 at 08:28
  • Thank you anyways for your help. I will keep trying here. – Garn_R Jun 03 '22 at 12:49

1 Answer

I updated my answer using your new example b. I found a solution based on: Find a numeric pattern R.

# Flag values at or below the lower limit (10) as 1, everything else as 0
x = ifelse(b<=10,1,0) # Don't need nested ifelse
# Flag values at or above the upper limit, [20, Inf), as 2
x[b>=20]=2

z = rle(x)

> z
Run Length Encoding
 lengths: int [1:10] 5 10 12 12 14 7 14 3 6 16
 values : num [1:10] 2 0 1 0 1 0 1 0 2 0

As in your plot, there are two possible patterns, going from the lower limit to the upper limit or the other way round: 1-0-2 or 2-0-1. You can do:

library(dplyr) # lag() and lead() below are dplyr's versions

ranges = z$values

# Look for 0 groups (temperature inside the range (10,20)) and check whether the group before is 2 and the next is 1 (up to low) or the group before is 1 and the next is 2 (low to up)
x = as.integer(ranges == 0 & ( (lag(ranges)==2 & lead(ranges)==1) | (lag(ranges)==1 & lead(ranges)==2) ) )

x
[1]  0  1  0  0  0  0  0  1  0 NA

You can sum, omitting the NAs, to return 2:

sum(x, na.rm=TRUE)

You could remove the 0s and just look for 1 followed by 2 or the other way round, but it is the same concept. If you want to keep z$lengths to work with them later, you could convert the rle() output to a data frame and adapt the code to mutate a new column.
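
The comments point out that the 1-0-2 / 2-0-1 check can match the same out-of-range group twice. Here is a minimal sketch of one way around that. It assumes an event is a non-overlapping pair of one low run and one high run, and that each reading is one hour. It drops the 0 groups, pairs each extreme with the next opposite extreme, and uses each group only once. `count_events` and its parameters (`low`, `high`, `min_hours`) are hypothetical names, not part of the code above.

```r
# Hypothetical helper: counts non-overlapping low/high pairs, so each
# out-of-range run contributes to at most one event.
count_events <- function(temp, low = 10, high = 20, min_hours = 1) {
  # 1 = at or below `low`, 2 = at or above `high`, 0 = in between
  state <- ifelse(temp <= low, 1, ifelse(temp >= high, 2, 0))
  runs <- rle(state)

  # Keep only out-of-range runs lasting at least `min_hours` readings
  # (assumption: one reading per hour)
  keep <- runs$values != 0 & runs$lengths >= min_hours
  extremes <- runs$values[keep]

  events <- 0
  pending <- NA  # extreme still waiting for its opposite
  for (e in extremes) {
    if (is.na(pending)) {
      pending <- e            # start a new candidate event
    } else if (e != pending) {
      events <- events + 1    # opposite extreme reached: one event
      pending <- NA           # both runs are consumed
    }
    # same extreme again: keep waiting for the opposite one
  }
  events
}

count_events(c(5, 15, 25, 15, 5, 15, 25))                 # 2
count_events(c(5, 15, 25, 15, 5, 15, 25), min_hours = 2)  # 0
```

Under these assumptions, `count_events(a)` and `count_events(b)` should both give 2 for the data in the question.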

RobertoT
  • Hi Roberto, thanks for the message. Your code is definitely closer to where I want to get, but not there yet. I want to count how many times a<=10 & a>=20 and not |, like a proxy for amplitude. – Garn_R Jun 02 '22 at 15:03