The following problem:
I have a measurement series with temperatures (see temp.tbl).
temp.tbl <- data.frame(temp = c(13, 14, 13, 11, 10, 12,
                                16, 18, 16, 13, 10, 11,
                                12, 14, 12, 10, 8, 7, 5)
)
Each time the temperature in the temp column falls below a certain threshold (here: 12), I want to check how many subsequent temperature values remain within a temperature range (here: 9 - 15):
- The first value in the temp column that falls below the threshold (12) and all subsequent values within the range (9 - 15) receive the value 1 in a new column (grp).
- The value that first leaves the range, either above or below it, also gets the value 1 in the grp column. In other words, these values are "combined" and assigned to group 1.
- If the temperature falls below the threshold (12) a second time later on (after having left the range before), this value and the following temperatures within the range get the value 2 in the grp column. These values are assigned to group 2.
- Each subsequent fall below the threshold and its subsequent "range values" are grouped according to their order of occurrence (group 3, group 4, etc.).
- All other rows receive the value 0 in the grp column.
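The rules above can be read as a small state machine, and a plain loop is one way to sketch it. This is only a minimal sketch under some assumptions: the function name label_groups and its arguments (threshold, lower, upper) are made up for illustration, "falls below the threshold" is taken to mean "current value below 12 and previous value above 12" (the same condition as the start expression further down), and the series is assumed to contain no NA values.

```r
# Example data from the question
temp.tbl <- data.frame(temp = c(13, 14, 13, 11, 10, 12,
                                16, 18, 16, 13, 10, 11,
                                12, 14, 12, 10, 8, 7, 5))

# Hypothetical helper: walks through the series once and keeps a switch state.
label_groups <- function(temp, threshold = 12, lower = 9, upper = 15) {
  grp <- integer(length(temp))  # all rows start as group 0
  on  <- FALSE                  # is a series currently running?
  k   <- 0L                     # number of series started so far

  for (i in seq_along(temp)) {
    # Assumption: a "fall below" needs the previous value to be above the threshold
    crossed <- i > 1 && temp[i] < threshold && temp[i - 1] > threshold

    if (!on && crossed) {
      on <- TRUE
      k  <- k + 1L
    }

    if (on) {
      grp[i] <- k
      # The first value outside the range still belongs to the group,
      # but it switches the series off for the following rows.
      if (temp[i] < lower || temp[i] > upper) on <- FALSE
    }
  }
  grp
}

temp.tbl$grp <- label_groups(temp.tbl$temp)
temp.tbl$grp
# 0 0 0 1 1 1 1 0 0 0 2 2 2 2 2 2 2 0 0
```

A loop like this is easy to check by hand against the rules; whether it is fast enough depends on the length of the series.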
The solution would look something like this (see temp_solution.tbl; the id column there is the grp column described above):
temp_solution.tbl <- data.frame(temp = c(13, 14, 13, 11, 10, 12, 16, 18, 16, 13, 10, 11, 12, 14, 12, 10, 8, 7, 5),
id = c(0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 0, 0),
onOff = c("off", "off", "off", "on", "on", "on", "on", "off", "off", "off", "on",
"on", "on", "on", "on", "on", "on", "off", "off"),
startEnd = c("off", "off", "off", "start", "on", "on", "end", "off", "off", "off", "start",
"on", "on", "on", "on", "on", "end", "off", "off")
)
temp_solution.tbl
   temp id onOff startEnd
1    13  0   off      off
2    14  0   off      off
3    13  0   off      off
4    11  1    on    start   # temp below threshold
5    10  1    on       on   # temp within limits
6    12  1    on       on   # temp within limits
7    16  1    on      end   # first temp outside limits
8    18  0   off      off
9    16  0   off      off
10   13  0   off      off
11   10  2    on    start   # temp below threshold
12   11  2    on       on   # temp within limits
13   12  2    on       on
14   14  2    on       on
15   12  2    on       on
16   10  2    on       on
17    8  2    on      end   # first temp outside limits
18    7  0   off      off
19    5  0   off      off
- The values in column temp of rows 1 - 3 are all above the threshold (12) and get the value 0 in column grp.
- The value in column temp in row 4 falls below the threshold for the first time: a 1 is inserted in column grp.
- The temperature values in rows 5 - 6 are within the range and are assigned to group 1. The value in row 7 leaves the range for the first time (it exceeds the range) and also receives a 1 in the grp column.
- Rows 8 - 10 are above the threshold value and do not fall below it: grp = 0.
- Row 11 falls below the threshold value for the second time: grp = 2.
- Rows 12 - 16 are within the range after falling below the threshold value: grp = 2.
- Row 17 leaves the range for the first time (downwards) and also receives grp = 2.
- Rows 18 - 19: grp = 0.
The columns onOff and startEnd are only for illustration.
The starting point can be determined e.g. with: temp < 12 & lag(temp > 12).
The range, in turn, e.g. with: between(temp, 9, 15).
library(dplyr)  # provides %>%, mutate(), lag() and between()

temp.tbl %>%
  mutate(
    start = temp < 12 & lag(temp > 12),
    range = between(temp, 9, 15)
  )
But once a start point has been "triggered", I have trouble connecting this start point with the following range condition so that the values form one coherent "measurement series" (switch is ON).
In other words, if the value in the start column changes from FALSE to TRUE, the switch is set to ON. As long as the subsequent values in the range column are TRUE, the switch remains ON; if the value in range changes to FALSE, the switch flips to OFF (that row is still counted as part of the series). All consecutive ONs form a group, and the groups are numbered in ascending order.
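One possible way to express this switch on top of the start and range columns is to carry the state forward with base R's Reduce(). The following is only a sketch under some assumptions: the names flagged, prev_range, switch_on and result are made up for illustration, lag() is given default = FALSE so that the first row cannot produce NA, and the series is assumed to contain no NA values. A row is treated as ON if it is a start point, or if the previous row was ON and still inside the range, which keeps the first out-of-range value in the group.

```r
library(dplyr)

temp.tbl <- data.frame(temp = c(13, 14, 13, 11, 10, 12,
                                16, 18, 16, 13, 10, 11,
                                12, 14, 12, 10, 8, 7, 5))

flagged <- temp.tbl %>%
  mutate(
    start      = temp < 12 & lag(temp > 12, default = FALSE),
    range      = between(temp, 9, 15),
    prev_range = lag(range, default = FALSE)  # was the previous row in range?
  )

# Carry the switch from row to row; the initial state is OFF.
# Reduce() returns the initial state as its first element, so drop it with [-1].
switch_on <- Reduce(
  function(prev_on, i) flagged$start[i] || (prev_on && flagged$prev_range[i]),
  seq_len(nrow(flagged)),
  init = FALSE,
  accumulate = TRUE
)[-1]

result <- flagged %>%
  mutate(
    on  = switch_on,
    # Each OFF -> ON transition opens a new group; OFF rows get 0
    grp = ifelse(on, cumsum(on & !lag(on, default = FALSE)), 0L)
  )

result$grp
# 0 0 0 1 1 1 1 0 0 0 2 2 2 2 2 2 2 0 0
```

With this formulation a new start point that occurs while the switch is already ON does not open a new group; whether that is the desired behaviour is a design choice.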