I'm using R and I have data as follow:
data<-data.frame(time=c(20230301000000,20230301000010, 20230301000020, 20230301000030,20230301000040,
20230301000050, 20230301000100 , 20230301000110 , 20230301000120, 20230301000130,
20230301000140, 20230301000150, 20230301000200, 20230301000210, 20230301000220,
20230301000230, 20230301000240, 20230301000250,20230301000300),
switch=c(40,41,42,43,0,0,0,51,52,53,0,0,0,55,56,57,52,0,0),
NH4=c(2,2,3,3,3,5,4,9,9,9,10,11,12,4,4,5,5,7,8))
Here, if switch
>0 , then it means that the switch is on. Otherwise, if switch
=0, then it means that the switch is off. switch
always has nonnegative values. I want to calulate the time duration of on and off of switch
in seconds. time
is written in year, month, day, hour, minute, second order. Within each on/ off time duration, I want to calculate the minimum, averge , and maximum value of NH4
. For the switch on case, I will denote it as on_NH4_min
, on_NH4_avg
, and on_NH4_max
. For the switch off case, I will denote it as off_NH4_min
, off_NH4_avg
, off_NH4_max
. So my desired output should looks like this:
summary_data <-data.frame(on_NH4_min=c(2,9,4),
on_NH4_avg=c(2.5,9,4,5),
on_NH4_max=c(3,9,5),
off_NH4_min=c(3,10,7),
off_NH4_avg=c(4,11,7,5),
off_NH4_max=c(5,12,8),
on_time=c(30,20,30),
off_time=c(20,20,10))
Chat GPT gave me an answer, but it didn't work for me. Here is the code from Chat GPT.
library(dplyr)
library(tidyr)
# Create a variable that indicates whether the switch is on or off
data <- data %>%
mutate(switch_on = ifelse(switch > 0, 1, 0))
# Calculate the time duration of each on/off interval
data <- data %>%
mutate(interval = cumsum(switch_on != lag(switch_on, default = 0))) %>%
group_by(interval, switch_on) %>%
mutate(start_time = first(time),
end_time = last(time),
duration = difftime(end_time, start_time, units = "secs")) %>%
ungroup() %>%
select(-interval)
# Calculate the NH4 statistics for each on/off interval
summary_data <- data %>%
group_by(switch_on) %>%
summarise(NH4_min = min(NH4),
NH4_avg = mean(NH4),
NH4_max = max(NH4),
time = sum(duration)) %>%
pivot_wider(names_from = switch_on,
values_from = c(NH4_min, NH4_avg, NH4_max, time),
names_prefix = c("on_", "off_"))
I want to get summary_data
.