Using GEKKO Python, we have trouble trying to learn a parameter that can vary multiple times per day. In some disciplines this is also known as 'regime detection or regime change detection'. We (me and my colleague Henri ter Hofte from Windesheim University of Applied Sciences) conceived 3 strategies but fail (more below).
Our question(s):
- What are we doing wrong, is there an obvious error in our GEKKO code (more below in the details)?
- Is strategy I doomed to fail, and should we switch to strategy II or II?
- Is GEKKO Python even suitable for doing this kind of regime (change) detection?
Your help is much appreciated.
=== The problem: We have time series data about: (1) CO₂ concentration (2) ventilation rates (or rather: valve fractions, which give ventilation rates, when multiplied with known maximum ventilation rates) (3) occupancy (number of persons in a room)
For research question (A) we would like to know a proper estimate for (2) for each hour of the day, given time series data about (1) and (3). For research question (B) we would like to know a proper estimate for (3) for each hour of the day, given time series data about (1) and (2).
We focus on research question A (but have similar questions for B).
=== The 3 strategies:
We considered 3 different strategies how to implement this using GEKKO Python: Strategy I. Declare the variable valve_frac as a Manipulated Variabe in our GEKKO model (m.MV), since the GEKKO documentation says these variables can be "adjusted by the optimizer to minimize an objective function at every time point". and "Manipulated variables are like FVs, but can change with each data row, either calculated by the optimizer (STATUS=1) or specified by the user (STATUS=0)." according to https://gekko.readthedocs.io/en/latest/imode.html#mv
Strategy II. Split the time into several shorter time spans (e.g. one time span per hour) and then learn valve_frac as a GEKKO Fixed Variable (m.FV), one for each hour.
Strategy III. Reframe the problem to GEKKO as if it were a control problem: the setpoint is reaching a particular CO2 concentration and GEKKO has can use valve_frac as a Control Variable (m.CV)
We tried implementing strategy I (see more info and code below) but fail to get proper results. Considering some equation derived from physics, we intend to find the best value for some specific variable (valve_frac__0 variable in following table. Having a dataframe (df_learn) like this:
Index | Date-Time | occupancy__p | valve_frac__0 | co2__ppm |
---|---|---|---|---|
1 | 2022.12.01 – 00:00:00 | 0 | 0.51 | 546 |
2 | 2022.12.01 – 00:15:00 | 4 | 0.85 | 820 |
3 | 2022.12.01 – 00:30:00 | 1 | 0.21 | 595 |
4 | 2022.12.01 – 00:45:00 | 2 | 0.74 | 635 |
5 | 2022.12.01 – 00:15:00 | 0 | 0.65 | 559 |
6 | 2022.12.01 – 00:15:00 | 0 | 0.45 | 538 |
7 | 2022.12.01 – 00:15:00 | 2 | 0.82 | 659 |
. | . | . | . | . |
. | . | . | . | . |
. | . | . | . | . |
1920 | 2022.12.20 – 00:15:00 | 3 | 0.73 | 749 |
We are trying to develop a moving horizon estimation model (IMODE=5) or Control model (IMODE=6) to predict the valve_frac__0 value. Following comes the code in Gekko format:
=== Code:
from gekko import GEKKO
# Gekko Model - Initialize
m = GEKKO(remote = False)
m.time = np.arange(0, duration__s, step__s)
# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6
# Constants
MET__mL_min_1_kg_1_p_1 = 3.5
desk_work__MET = 1.5
P_std__Pa = 101325
R__m3_Pa_K_1_mol_1 = 8.3145
T_room__degC = 20.0
T_std__degC = 0.0
T_zero__K = 273.15
T_std__K = T_zero__K + T_std__degC
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001
# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)
co2_ext__ppm = 415
# National averages
weight__kg = 77.5
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 * weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3
co2_o2 = 0.894
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1
# Room averages
wind__m_s_1 = 3.0
# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = df_learn.occupancy__p.values)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1
# Strategy I:
valve_frac__0 = m.MV(value = df_learn.valve_frac__0.values)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0
# GEKKO Control Varibale (predicted variable)
co2__ppm = m.CV(value = df_learn.co2__ppm.values)
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1
# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) * (vent_max__m3_s_1 * valve_frac__0 + wind__m_s_1 * infilt__m2) / room__m3)
co2_gain_mol0_s_1 = m.Intermediate(occupancy__p * co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))
co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)
m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)
# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.solve(disp = False)
The result which I got for each scenario come as follow:
Strategy I:
There is no output for simulated “co2__ppm” and the output value for valve_frac__0 = 0
Strategy II:
There is big difference between simulated and measured “co2__ppm” and the output value for valve_frac__0 = 0.166 (which is not reasonable)