Problem using GEKKO to do regime change detection (estimating time-varying variables)

Question

Using GEKKO Python, we have trouble trying to learn a parameter that can vary multiple times per day. In some disciplines this is also known as 'regime detection or regime change detection'. We (me and my colleague Henri ter Hofte from Windesheim University of Applied Sciences) conceived 3 strategies but fail (more below).

Our question(s):

What are we doing wrong, is there an obvious error in our GEKKO code (more below in the details)?
Is strategy I doomed to fail, and should we switch to strategy II or II?
Is GEKKO Python even suitable for doing this kind of regime (change) detection?

Your help is much appreciated.

=== The problem: We have time series data about: (1) CO₂ concentration (2) ventilation rates (or rather: valve fractions, which give ventilation rates, when multiplied with known maximum ventilation rates) (3) occupancy (number of persons in a room)

For research question (A) we would like to know a proper estimate for (2) for each hour of the day, given time series data about (1) and (3). For research question (B) we would like to know a proper estimate for (3) for each hour of the day, given time series data about (1) and (2).

We focus on research question A (but have similar questions for B).

=== The 3 strategies:

We considered 3 different strategies how to implement this using GEKKO Python: Strategy I. Declare the variable valve_frac as a Manipulated Variabe in our GEKKO model (m.MV), since the GEKKO documentation says these variables can be "adjusted by the optimizer to minimize an objective function at every time point". and "Manipulated variables are like FVs, but can change with each data row, either calculated by the optimizer (STATUS=1) or specified by the user (STATUS=0)." according to https://gekko.readthedocs.io/en/latest/imode.html#mv

Strategy II. Split the time into several shorter time spans (e.g. one time span per hour) and then learn valve_frac as a GEKKO Fixed Variable (m.FV), one for each hour.

Strategy III. Reframe the problem to GEKKO as if it were a control problem: the setpoint is reaching a particular CO2 concentration and GEKKO has can use valve_frac as a Control Variable (m.CV)

We tried implementing strategy I (see more info and code below) but fail to get proper results. Considering some equation derived from physics, we intend to find the best value for some specific variable (valve_frac__0 variable in following table. Having a dataframe (df_learn) like this:

Index	Date-Time	occupancy__p	valve_frac__0	co2__ppm
1	2022.12.01 – 00:00:00	0	0.51	546
2	2022.12.01 – 00:15:00	4	0.85	820
3	2022.12.01 – 00:30:00	1	0.21	595
4	2022.12.01 – 00:45:00	2	0.74	635
5	2022.12.01 – 00:15:00	0	0.65	559
6	2022.12.01 – 00:15:00	0	0.45	538
7	2022.12.01 – 00:15:00	2	0.82	659
.	.	.	.	.
.	.	.	.	.
.	.	.	.	.
1920	2022.12.20 – 00:15:00	3	0.73	749

We are trying to develop a moving horizon estimation model (IMODE=5) or Control model (IMODE=6) to predict the valve_frac__0 value. Following comes the code in Gekko format:

=== Code:

from gekko import GEKKO
# Gekko Model - Initialize
m = GEKKO(remote = False)
m.time = np.arange(0, duration__s, step__s)

# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6

# Constants
MET__mL_min_1_kg_1_p_1 = 3.5                                  
desk_work__MET = 1.5                                 
P_std__Pa = 101325                                            
R__m3_Pa_K_1_mol_1 = 8.3145                                   
T_room__degC = 20.0                                           
T_std__degC = 0.0                                             
T_zero__K = 273.15                                           
T_std__K = T_zero__K + T_std__degC                            
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001                          

# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)  
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)    
co2_ext__ppm = 415                                            
        
# National averages
weight__kg = 77.5                                             
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 * weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3           
co2_o2 = 0.894                                           
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1    

# Room averages
wind__m_s_1 = 3.0                                             

# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = df_learn.occupancy__p.values)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1

# Strategy I:
valve_frac__0 = m.MV(value = df_learn.valve_frac__0.values)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0

# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0

# GEKKO Control Varibale (predicted variable)
co2__ppm = m.CV(value = df_learn.co2__ppm.values) 
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1

# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) * (vent_max__m3_s_1 * valve_frac__0 + wind__m_s_1 * infilt__m2) / room__m3)

co2_gain_mol0_s_1 = m.Intermediate(occupancy__p * co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))

co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)

m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)

# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.solve(disp = False)

The result which I got for each scenario come as follow:

Strategy I:

There is no output for simulated “co2__ppm” and the output value for valve_frac__0 = 0

Strategy II:

There is big difference between simulated and measured “co2__ppm” and the output value for valve_frac__0 = 0.166 (which is not reasonable)

John Hedengren · Accepted Answer · 2023-01-30T06:11:42.807

The code looks like it should work as long as valve_frac__0 is the adjustable unknown parameter that should be estimated from the CO2 PPM data. Here is a result on a smaller subset of the posted data.

The data doesn't fit exactly if there is a lower bound of zero on the valve position.

valve_frac__0 = m.MV(value = valve_frac__0,lb=0)

Otherwise, the valve position can be adjusted to fit the CO2 data perfectly.

Here is a complete script with the sample data.

from gekko import GEKKO
import numpy as np
# Gekko Model - Initialize
m = GEKKO(remote = False)

# data
# 1 2022.12.01 – 00:00:00   0   0.51    546
# 2 2022.12.01 – 00:15:00   4   0.85    820
# 3 2022.12.01 – 00:30:00   1   0.21    595
# 4 2022.12.01 – 00:45:00   2   0.74    635
# 5 2022.12.01 – 00:15:00   0   0.65    559
# 6 2022.12.01 – 00:15:00   0   0.45    538
# 7 2022.12.01 – 00:15:00   2   0.82    659

occupancy__p = np.array([0,4,1,2,0,0,2])
valve_frac__0 = np.array([0.51,0.85,0.21,0.74,0.65,0.45,0.82])
co2__ppm_meas = np.array([546,820,595,635,559,538,659])

duration__s = len(co2__ppm_meas)
m.time = np.linspace(0,duration__s-1,duration__s)

vent_max__m3_s_1 = 1
room__m3 = 1

# Conversion factors
s_min_1 = 60
min_h_1 = 60
s_h_1 = s_min_1 * min_h_1
mL_m_3 = 1e3 * 1e3
million = 1e6

# Constants
MET__mL_min_1_kg_1_p_1 = 3.5                                  
desk_work__MET = 1.5                                 
P_std__Pa = 101325                                            
R__m3_Pa_K_1_mol_1 = 8.3145                                   
T_room__degC = 20.0                                           
T_std__degC = 0.0                                             
T_zero__K = 273.15                                           
T_std__K = T_zero__K + T_std__degC                            
T_room__K = T_zero__K + T_room__degC
infilt__m2 = 0.001                          

# Approximations
room__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_room__K)  
std__mol_m_3 = P_std__Pa / (R__m3_Pa_K_1_mol_1 * T_std__K)    
co2_ext__ppm = 415                                            
        
# National averages
weight__kg = 77.5                                             
MET__m3_s_1_p_1 = MET__mL_min_1_kg_1_p_1 \
                * weight__kg / (s_min_1 * mL_m_3)
MET_mol_s_1_p_1 = MET__m3_s_1_p_1 * std__mol_m_3           
co2_o2 = 0.894                                           
co2__mol0_p_1_s_1 = co2_o2 * desk_work__MET * MET_mol_s_1_p_1    

# Room averages
wind__m_s_1 = 3.0                                             

# GEKKO Manipulated Variables: measured values
occupancy__p = m.MV(value = occupancy__p)
occupancy__p.STATUS = 0; occupancy__p.FSTATUS = 1

# Strategy I:
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0

# Strategy II:
#valve_frac__0 = m.FV(value = df_learn.valve_frac__0.values)
#valve_frac__0.STATUS = 1; valve_frac__0.FSTATUS = 0

# GEKKO Control Varibale (predicted variable)
co2__ppm = m.CV(value = co2__ppm_meas) 
co2__ppm.STATUS = 1; co2__ppm.FSTATUS = 1

# GEKKO - Equations
co2_loss__ppm_s_1 = m.Intermediate((co2__ppm - co2_ext__ppm) \
                      * (vent_max__m3_s_1 * valve_frac__0 \
                      + wind__m_s_1 * infilt__m2) / room__m3)

co2_gain_mol0_s_1 = m.Intermediate(occupancy__p \
            * co2__mol0_p_1_s_1 / (room__m3 * room__mol_m_3))

co2_gain__ppm_s_1 = m.Intermediate(co2_gain_mol0_s_1 * million)

m.Equation(co2__ppm.dt() == co2_gain__ppm_s_1 - co2_loss__ppm_s_1)

# GEKKO - Solver setting
m.options.IMODE = 5
m.options.EV_TYPE = 1
m.options.NODES = 2
m.options.SOLVER = 1
m.solve(disp = True) 


import matplotlib.pyplot as plt

plt.subplot(2,1,1)
plt.plot(m.time,valve_frac__0.value,'r-',label='Valve Frac')
plt.legend(); plt.grid(); plt.ylabel('Valve Frac')
plt.subplot(2,1,2)
plt.plot(m.time,co2__ppm_meas,'ko',label='Measured')
plt.plot(m.time,co2__ppm.value,'k--',label='Predicted')
plt.legend(); plt.grid()
plt.xlabel('Time'); plt.ylabel('CO2')
plt.savefig('results.png',dpi=300)
plt.show()

For question B, adjust the code to make the valve position fixed at the measured values and the occupancy determined by the optimizer.

occupancy__p = m.MV(value = occupancy__p)
occupancy__p.STATUS = 1; occupancy__p.FSTATUS = 0

# Strategy I:
valve_frac__0 = m.MV(value = valve_frac__0,lb=0)
valve_frac__0.STATUS = 0; valve_frac__0.FSTATUS = 1

Use occupancy__p.MV_STEP_HOR = 2 or higher decrease the frequency at which the optimized parameter can change (e.g. every 2 hours).

Thanks for your answer. It is helpful for predicting valve_frac__0 which is a float variable. However, We faced some problems when we want to predict occupancy__p which is an integer variable. According to your suggestion, we change the setting for this case which would be: occupancy__p = m.MV(value = occupancy__p) occupancy__p.STATUS = 1; occupancy__p.FSTATUS = 0 & valve_frac__0 = m.MV(value = valve_frac__0,lb=0) valve_frac__0.STATUS = 0; valve_frac__0.FSTATUS = 1 with m.options.EV_TYPE = 1 m.options.SOLVER = 1 but after hours of running, the model comes up with “Solution Not Found”. — Hossein, Mar 15 '23 at 10:31
Moreover, m.options.Solver=3 wiht m.options.EV_TYPE=2 is tried which predicts float numbers for occupancy (number of people in the room) which is not reasonable. It would be great if you could help us in this case. — Hossein, Mar 15 '23 at 10:34
Use `occupancy__p = m.MV(value = occupancy__p, integer=True)` to specify an integer value. Only the APOPT solver can find integer solutions. Could you create an additional, new question that describes the issue with the specific code that replicates the error? — John Hedengren, Mar 16 '23 at 21:28

Problem using GEKKO to do regime change detection (estimating time-varying variables)

Strategy I:

Strategy II:

1 Answers1

Linked