5

I'm using Python GEKKO to model a chemical reaction, which can be described like this:

1 -> 2 -> 3 -> 4

with side reactions as follows:

2 -> 5

3 -> 5

The product (4) ist stable. This leads to the following set of ODEs (rate equations), with rate constants k and the concentrations of the components c(i).

m.Equation(c_1p.dt() == -k1*c_1p)
m.Equation(c_2p.dt() == k1*c_1p - (k21 + k22)*c_2p)
m.Equation(c_3p.dt() == k21*c_2p - (k31 + k32)*c_3p)
m.Equation(c_4p.dt() == k31*c_3p)

I implemented these equations in GEKKO to estimate the rate constants. I initialized my measured values as parameters, the predicted values as variables and the rate constants as fixed values (with limitations). Objective function is least square method. This works fine and the results are within expectation (R2 > 0.99).

The problem is, that when I try and validate these results by using the calculated rate constants to solve the ODEs (with GEKKO or scipy odeint), I get a different result (see figure 1). Points are the measured values, X marks the predicted values, the dashed lines represent the curves that are calculated with odeint using the calculated rate constants.

Concentration plot

Question is, where does this deviation originate? I can't find the source.

Thank you!

from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint

m = GEKKO(remote=False)
p = GEKKO(remote=False)

##  Arrays: Measurement Data

c_1 = [29.99122982,  1.24267302,1,1,1,1]
c_2 = [92.163874642, 34.957203757, 15.868747086, 0.0, 0.0, 0.0]
c_3 = [1.5290222821, 7.3374483783, 3.1998472876, 0.66149919406, 0.069616016241, 0.0071280867007]
c_4 = [0.0, 38.487210404, 51.285418999, 57.66299199, 60.869003531, 64.89535785]

m.time = [0,15,30,60,120,180]
t = m.time

p.time = np.linspace(0,180,181)
t_p = p.time


##Variables

c_1m = m.Param(value=c_1)      
c_2m = m.Param(value=c_2)
c_3m = m.Param(value=c_3)
c_4m = m.Param(value=c_4)

c_1p = m.Var(value=c_1m)          
c_2p = m.Var(value=c_2m)
c_3p = m.Var(value=c_3m)
c_4p = m.Var(value=c_4m)

c_1pp = p.Var(value = c_1[0])    
c_2pp = p.Var(value = c_2[0])
c_3pp = p.Var(value = c_3[0])
c_4pp = p.Var(value = c_4[0])

k1 = m.FV(lb = 0, ub = 2)
k1.STATUS = 1
k21 = m.FV(lb = 0)
k21.status = 1
k22 = m.FV(lb = 0)
k22.status = 1
k31 = m.FV(lb = 0)
k31.status = 1
k32 = m.FV(lb = 0)
k32.status = 1

##m.Equations

m.Equation(c_1p.dt() == -k1*c_1p)
m.Equation(c_2p.dt() == k1*c_1p - k21*c_2p - k22*c_2p)
m.Equation(c_3p.dt() == k21*c_2p - k31*c_3p - k32*c_3p)
m.Equation(c_4p.dt() == k31*c_3p)


##Objective

m.Obj((c_4p - c_4m)**2 + (c_3p-c_3m)**2 + (c_2p-c_2m)**2 + (c_1p-c_1m)**2)

##Application options

m.options.IMODE = 5
m.options.solver = 3

p.options.IMODE = 4

##Solve

m.solve()

##p.Equations

p.Equation(c_1pp.dt() == -k1[0]*c_1pp)
p.Equation(c_2pp.dt() == k1[0]*c_1p - k21[0]*c_2pp - k22[0]*c_2pp)
p.Equation(c_3pp.dt() == k21[0]*c_2p - k31[0]*c_3pp - k32[0]*c_3pp)
p.Equation(c_4pp.dt() == k31[0]*c_3pp)

p.solve()

def rxn(C,t_p):

    c_10 = C[0]
    c_20 = C[1]
    c_30 = C[2]
    c_40 = C[3]

    d1dt = -k1[0] * c_10
    d2dt = k1[0] * c_10 - (k21[0] + k22[0]) * c_20
    d3dt = k21[0] * c_20 - (k31[0] + k32[0]) * c_30
    d4dt = k31[0] * c_30

    return [d1dt, d2dt, d3dt, d4dt]

C = [29.991229823,92.163874642,1.5290222821,0.0]
model = odeint(rxn,C,t_p)

##Plot/Print

print('c_4m = ' +  str(c_4m.VALUE))
print('c_4p = ' +  str(c_4p.VALUE))
print('c_3m = ' +  str(c_3p.VALUE))
print('c_3p = ' + str(c_3m.VALUE))
print('c_2m = ' + str(c_2m.VALUE))
print('c_2p = ' + str(c_2p.VALUE))
print('c_1p = ' +str(c_1p.VALUE))
print('c_1m = ' + str(c_1m.value))
print('k1 = ' + str(k1.VALUE))
print('k21 = ' + str(k21.VALUE))
print('k22 = ' + str(k22.VALUE))
print('k31 = ' + str(k31.VALUE))
print('k32 = ' + str(k32.VALUE))


plt.figure(1)
plt.plot(t,c_1m,'ro', label = "1 meas")
plt.plot(t,c_1p, 'rx', label = "1 pred")
plt.plot(p.time, model[:,0] ,'r--', label= "1 mod")
plt.plot(t,c_2m,'go', label = "2 meas")
plt.plot(t,c_2p,'gx', label = "2 pred")
plt.plot(p.time,model[:,1],'g--', label = "2 mod")
plt.plot(t,c_3m,'yo', label = "3 meas")
plt.plot(t, c_3p, 'yx', label= "3 pred")
plt.plot(p.time,model[:,2],'y--', label = "3 mod")
plt.plot(t,c_4m,'bo', label="4 meas")
plt.plot(t,c_4p,'bx', label = "4 pred")
plt.plot(p.time,model[:,3],'b--', label = "4 mod")
plt.xlabel('Time [min]')
plt.ylabel('Concentration [mmol/l]')
plt.legend(loc='best', ncol = 4)
plt.grid()
plt.show()
Yann
  • 2,426
  • 1
  • 16
  • 33
lois_power
  • 53
  • 3

1 Answers1

2

The problem is resolved by increasing the number of nodes for each time step.

##Application options
ND = 3
m.options.IMODE = 5
m.options.NODES = ND

p.options.IMODE = 4
p.options.NODES = ND

Improved accuracy

Gekko requires users to specify the time points where they would like to compute a solution. In your case, they correspond to the measurements. You just need at least one internal node (NODES=3) to improve the accuracy of the ODE integration. The accuracy generally increases with more nodes (2-6) but there is also an increase in computational cost because the model is calculated at all of the nodes. There is more information on orthogonal collocation on finite elements in the Machine Learning and Dynamic Optimization course.

John Hedengren
  • 12,068
  • 1
  • 21
  • 25
  • 2
    It seems that `NODES=3` could be a better default option in Gekko. Is `NODES=2` equivalent to an implicit Euler's method? – TexasEngineer Feb 22 '20 at 02:27
  • 2
    Good observation. `NODES=2` is equivalent to an implicit Euler's method and requires more time points to achieve the required accuracy than a higher order method for each time step. One of the reasons that it is the default is because it is required for compatibility with some types of models such as discrete state space. Perhaps we move the default to `NODES=3` and then give an error message if `NODES>=3` when using the incompatible models. – John Hedengren Feb 22 '20 at 02:31