I have been looking, but I could not find a direct way to use the DifferentialEquations.jl parameter estimation tools in Julia to fit multiple datasets at once. So, let's say we have this simple differential equation with two parameters:

function f1(du,u,p,t)
  du[1] = -p[1]*p[2]*u[1]
end

We have experimental datasets of u[1] vs t. Each dataset has a different value of p[2] and/or different initial conditions; p[1] is the parameter we want to estimate. I can do this by solving the differential equation in a for loop that iterates over the different initial conditions and p[2] values, storing the solutions in an array, and creating a loss function against the experimental data (sketched below). I wonder if there is a way of doing this in fewer lines of code, using, for instance, DiffEqBase.problem_new_parameters to set the conditions of each dataset. This is a very common situation when fitting models to experimental data, but I could not find a good example in the documentation.
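
For reference, the for-loop approach I currently use looks roughly like this (just a sketch; u0s, p2s, datas, and t are placeholders for the known initial conditions, p[2] values, measured data arrays, and measurement times):

function brute_force_loss(p1)
  l = 0.0
  for (u0,p2,data) in zip(u0s,p2s,datas)
    prob = ODEProblem(f1,[u0],tspan,[p1,p2])  # one problem per dataset
    sol = solve(prob,Tsit5(),saveat=t)
    l += sum(abs2, Array(sol) .- data)        # squared error for this dataset
  end
  l
end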

Thank you in advance,

Best regards

EDIT 1

The case expressed above is just a simplified example. To make it a practical case, we can create some fake experimental data with the following code:

using DifferentialEquations

# ODE function
function f1(du,u,p,t)
  du[1] = -p[1]*p[2]*u[1]
end

# Initial conditions and parameter values.
# p1 is the parameter to be estimated.
# p2 and u0 are experimental parameters known for each dataset.
u0 = [1.,2.]
p1 = .3
p2 = [.1,.2]
tspan = (0.,10.)
t = linspace(0,10,5)


# Creating data for 1st experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .1
prob1 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[1]])
sol1 = solve(prob1,Tsit5(),saveat=t)

# Creating data for 2nd experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .2
prob2 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[2]])
sol2 = solve(prob2,Tsit5(),saveat=t)

# Creating data for 3rd experimental dataset.
## Experimental conditions: u0 = 2. ; p[2] = .1
prob3 = ODEProblem(f1,[u0[2]],tspan,[p1,p2[1]])
sol3 = solve(prob3,Tsit5(),saveat=t)

sol1, sol2 and sol3 are now our experimental data, each dataset using a different combination of initial conditions and p[2] (which represents some experimental variable, e.g., temperature, flow, etc.).

The objective is to estimate the value of p[1] using the experimental data sol1, sol2 and sol3, letting DiffEqBase.problem_new_parameters or another alternative iterate over the experimental conditions.

1 Answer

What you can do is create a MonteCarloProblem that solves all three of the problems at once:

function prob_func(prob,i,repeat)
  # Datasets 1 and 2 start from u0 = 1.0; dataset 3 starts from u0 = 2.0
  u0 = i < 3 ? [1.0] : [2.0]
  # Dataset 2 has p2 = 0.2; datasets 1 and 3 have p2 = 0.1
  p = i == 2 ? (prob.p[1],0.2) : (prob.p[1],0.1)
  ODEProblem{true}(f1,u0,(0.0,10.0),p)
end
prob = MonteCarloProblem(prob1,prob_func = prob_func)

@time sol = solve(prob,Tsit5(),saveat=t,parallel_type = :none,
                  num_monte = 3)

Then create a loss function that compares each of the solutions against the three datasets and adds their losses together.

loss1 = L2Loss(t,data1)
loss2 = L2Loss(t,data2)
loss3 = L2Loss(t,data3)
loss(sol) = loss1(sol[1]) + loss2(sol[2]) + loss3(sol[3])

Finally, you need to tell it how to relate the optimization parameter(s) to the problem it's solving. Here, our MonteCarloProblem holds a prob that it pulls p[1] from whenever it generates a problem. The value that we want to optimize is that p[1], so:

function my_problem_new_parameters(prob,p)
  prob.prob.p[1] = p[1] # update p1 in the base problem; prob_func reads it when regenerating each dataset's problem
  prob
end

Now our objective function is exactly those pieces put together:

obj = build_loss_objective(prob,Tsit5(),loss,
                           prob_generator = my_problem_new_parameters,
                           num_monte = 3,
                           saveat = t)

Now let's throw that to Optim.jl's Brent method:

using Optim
res = optimize(obj,0.0,1.0)

Results of Optimization Algorithm
 * Algorithm: Brent's Method
 * Search Interval: [0.000000, 1.000000]
 * Minimizer: 3.000000e-01
 * Minimum: 2.004680e-20
 * Iterations: 10
 * Convergence: max(|x - x_upper|, |x - x_lower|) <= 2*(1.5e-08*|x|+2.2e-16): true
 * Objective Function Calls: 11

It found that the overall best value is 0.3, which is the parameter we used to generate the data.
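
If you want that value programmatically, Optim exposes it on the result object:

Optim.minimizer(res) # ≈ 0.3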

Here's the code in full:

using DifferentialEquations
using DiffEqParamEstim # L2Loss and build_loss_objective live here

# ODE function
function f1(du,u,p,t)
  du[1] = -p[1]*p[2]*u[1]
end

# Initial conditions and parameter values.
# p1 is the parameter to be estimated.
# p2 and u0 are experimental parameters known for each dataset.
u0 = [1.,2.]
p1 = .3
p2 = [.1,.2]
tspan = (0.,10.)
t = linspace(0,10,5)

# Creating data for 1st experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .1
prob1 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[1]])
sol1 = solve(prob1,Tsit5(),saveat=t)
data1 = Array(sol1)

# Creating data for 2nd experimental dataset.
## Experimental conditions: u0 = 1. ; p[2] = .2
prob2 = ODEProblem(f1,[u0[1]],tspan,[p1,p2[2]])
sol2 = solve(prob2,Tsit5(),saveat=t)
data2 = Array(sol2)

# Creating data for 3rd experimental dataset.
## Experimental conditions: u0 = 2. ; p[2] = .1
prob3 = ODEProblem(f1,[u0[2]],tspan,[p1,p2[1]])
sol3 = solve(prob3,Tsit5(),saveat=t)
data3 = Array(sol3)

function prob_func(prob,i,repeat)
  # Datasets 1 and 2 start from u0 = 1.0; dataset 3 starts from u0 = 2.0
  u0 = i < 3 ? [1.0] : [2.0]
  # Dataset 2 has p2 = 0.2; datasets 1 and 3 have p2 = 0.1
  p = i == 2 ? (prob.p[1],0.2) : (prob.p[1],0.1)
  ODEProblem{true}(f1,u0,(0.0,10.0),p)
end
prob = MonteCarloProblem(prob1,prob_func = prob_func)

# Just to show what this looks like
sol = solve(prob,Tsit5(),saveat=t,parallel_type = :none,
            num_monte = 3)

loss1 = L2Loss(t,data1)
loss2 = L2Loss(t,data2)
loss3 = L2Loss(t,data3)
loss(sol) = loss1(sol[1]) + loss2(sol[2]) + loss3(sol[3])

function my_problem_new_parameters(prob,p)
  prob.prob.p[1] = p[1] # update p1 in the base problem; prob_func reads it when regenerating each dataset's problem
  prob
end
obj = build_loss_objective(prob,Tsit5(),loss,
                           prob_generator = my_problem_new_parameters,
                           num_monte = 3,
                           saveat = t)

using Optim
res = optimize(obj,0.0,1.0)
Chris Rackauckas
  • Thank you, that's what I was looking for! – JIbab Jun 17 '18 at 16:17
  • Coming back to this issue, I have an observation to add. I've been testing, and using saveat in the objective function is necessary when it is a Monte Carlo problem but not for a regular ODE problem; I guess it interpolates sol(t[i]). When trying to optimize a Monte Carlo problem without saveat, it converges to a wrong solution. It would be useful to have at least one of these two options: 1. specify saveat for each Monte Carlo iteration (since different datasets may have different time points), or 2. use interpolation in sol[i](t[j]) (sketched below). – JIbab Nov 26 '18 at 12:53
  • We probably can handle this on the library side. Please open an issue. – Chris Rackauckas Nov 26 '18 at 18:22
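
Just to illustrate that second option, an interpolation-based loss could look roughly like this (a sketch only; ts[i] and datas[i] are hypothetical per-dataset time vectors and data arrays, and the Monte Carlo solve would have to run without saveat so each solution keeps its dense interpolant):

function interp_loss(sol)
  l = 0.0
  for i in 1:3                      # one loss term per dataset
    for (j,tj) in enumerate(ts[i])  # this dataset's own time points
      l += sum(abs2, sol[i](tj) .- datas[i][:,j])  # interpolate solution i at tj
    end
  end
  l
end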