2

I'm trying to calculate a matrix-form ODE, i.e., (∂/∂t)ψ = -iHψ ,where ψ is a vector and H is a (time-independent) matrix.
I have two problems.

1. Difference of calculation time between DifferentialEquations.jl and Runge Kutta code
I calculated the above equation in two ways.
DifferentialEquations.jl

    using LinearAlgebra    
    using OrdinaryDiffEq   
    using DifferentialEquations

    #matrix
    function Ham(Lx,Ly,Lz) 
     4*SINk₁(Lx,Ly,Lz) + 3*im*SINk₂(Lx,Ly,Lz) + COSk₃(Lx,Ly,Lz) -im*EYE(Lx,Ly,Lz)
    end

    # Initial conditions
    ψ0 = Complex{Float64}[] 
    σ = sqrt(5/2)

    for iz = 1:Lz        
     for ix = 1:Lx
      for iy=1:Ly             
        gauss = (1/(((sqrt(2*π))*σ)^3))exp(-((ix-5)^2 + (iy-5)^2 + (iz-5)^2)/(2*(σ)^2))
        push!(ψ0,gauss) 
      end
     end
    end

    #normalization
    ψ0 = ψ0./norm(ψ0)

    #time span
    tspan = (0.0,20.0) 

    #ODE
    function time_evolution(ψdot::Array{Complex{Float64}, 1},ψ::Array{Complex{Float64}, 1},para::Float64,t::Float64)
     ψdot.=-im.*Ham(Lx,Ly,Lz)*ψ
    end

    #solve the equation
    prob = ODEProblem(time_evolution,ψ0,tspan)
    @time sol = solve(prob)

Where SINk₁(Lx,Ly,Lz),SINk₂(Lx,Ly,Lz),COSk₃(Lx,Ly,Lz),EYE(Lx,Ly,Lz) are

 function EYE(Lx,Ly,Lz)
     N = Lx*Ly*Lz
     mat = Matrix{Complex{Float64}}(I, N, N) 
     return mat
 end


 function SINk₁(Lx,Ly,Lz)
  N = Lx*Ly*Lz

mat = zeros(Complex{Float64},N,N) 

for ix = 1:Lx
    for iy = 1:Ly 
        for iz = 1:Lz 
            for dx in -1:1             
             jx = ix + dx 
                for dy in -1:1
                jy = iy + dy
                    for dz in -1:1
                    jz = iz + dz
                    
                    
                        ii = (iz-1)*Lx*Ly + (ix-1)*Ly + iy 
                        jj = (jz-1)*Lx*Ly + (jx-1)*Ly + jy

                            if 1 <= jx <= Lx && 1 <= jy <= Ly && 1 <= jz <= Lz
                                                     
                                if dx == +1 && dy == 0 && dz == 0
                                    mat[ii,jj] += -(im/2) 
                                end
                
                                if dx == -1 && dy == 0 && dz == 0
                                    mat[ii,jj] += im/2
                                end
                                   
                            end    
                
                    end
                end
            end
        end
    end
end

return mat
end


function SINk₂(Lx,Ly,Lz)
  N = Lx*Ly*Lz

mat = zeros(Complex{Float64},N,N) 

for ix = 1:Lx
    for iy = 1:Ly 
        for iz = 1:Lz 
            for dx in -1:1             
             jx = ix + dx 
                for dy in -1:1
                jy = iy + dy
                    for dz in -1:1
                    jz = iz + dz
                    
                    
                        ii = (iz-1)*Lx*Ly + (ix-1)*Ly + iy 
                        jj = (jz-1)*Lx*Ly + (jx-1)*Ly + jy

                            if 1 <= jx <= Lx && 1 <= jy <= Ly && 1 <= jz <= Lz
                                                     
                                if dx == 0 && dy == +1 && dz == 0
                                    mat[ii,jj] += -im/2
                                end
                
                                if dx == 0 && dy == -1 && dz == 0
                                    mat[ii,jj] += im/2
                                end
                                   
                            end    
                
                    end
                end
            end
        end
    end
end

return mat
end


function COSk₃(Lx,Ly,Lz)
 N = Lx*Ly*Lz

mat = zeros(Complex{Float64},N,N) 

for ix = 1:Lx
    for iy = 1:Ly 
        for iz = 1:Lz 
            for dx in -1:1             
             jx = ix + dx 
                for dy in -1:1
                jy = iy + dy
                    for dz in -1:1
                    jz = iz + dz
                    
                    
                        ii = (iz-1)*Lx*Ly + (ix-1)*Ly + iy 
                        jj = (jz-1)*Lx*Ly + (jx-1)*Ly + jy

                            if 1 <= jx <= Lx && 1 <= jy <= Ly && 1 <= jz <= Lz
                                                     
                                if dx == 0 && dy == 0 && dz == +1
                                    mat[ii,jj] += 1/2 
                                end
                
                                if dx == 0 && dy == 0 && dz == -1
                                    mat[ii,jj] += 1/2
                                end
                                   
                            end    
                
                    end
                end
            end
        end
    end
end

return mat
end

For matrix size Lx = Ly = Lz = 10, this takes about 50 secs:

  49.591043 seconds (28.10 k allocations: 141.020 GiB, 21.55% gc time)

Runge Kutta code

function rk4(ψ::Array{Complex{Float64}, 1},H::Array{Complex{Float64}, 2},δt::Float64)

f(a) = -im*H*a

k1 = f(ψ)
k2 = f(ψ + (1/2)*δt*k1)
k3 = f(ψ + (1/2)*δt*k2)
k4 = f(ψ + δt*k3)
ψ = ψ + δt*(1/6)*(k1 + 2*k2 + 2*k3 + k4)

return ψ

end


function RK4(ti::Int64,tf::Int64,interval::Int64,ψ0::Array{Complex{Float64}, 1})

#time span
δt = (tf-ti)/interval

size = length(ψ0)

#container of solutions
list_t = zeros(Float64,interval+1)
list_ψ = zeros(Complex{Float64},interval+1,size)

#initial condition
list_t[1] = ti
list_ψ[1,1:size] = ψ0

 #store solutions
for i in 2:interval+1
    list_ψ[i,1:size] = rk4(list_ψ[i-1,1:size],H,δt)
    list_t[i] = list_t[i-1]+δt
end

return list_t,list_ψ

end


#matrix
H = Ham(Lx,Ly,Lz) 

#time span
ti = 0
tf = 20
interval = 60

@time list_t1,list_ψ1 = RK4(ti,tf,interval,ψ0)

For matrix size Lx = Ly = Lz = 10 and the same δt, this takes about 2.0 secs:

 1.882741 seconds (1.45 k allocations: 3.592 GiB, 24.03% gc time)

What causes the difference?

2. Difference of calculation time between Julia and Python

Even in the Runge Kutta code,it takes too much times for large matrix size.
For example, Lx = Ly = Lz = 20, i.e. 8000×8000 matrix, takes about 130 secs.

However, in python, I heard Runge Kutta 4th order method takes only few seconds even for 100000*100000 matrix.
Why so fast?
The same calculation time is possible in julia?

(2020/10/24)
I checked the calculation time using adaptive=false in julia. Since installing BenchmarkTools was failed for some reason,I used @time macro.
For the above Ham(Lx,Ly,Lz),ψ0 and Lx = Ly = Lz = 10,

RK4

#time span
ti = 0
tf = 20
interval = 60  #dt=1/3

@time list_t1,list_ψ1 = RK4(ti,tf,interval,ψ0)
0.983133 seconds (1.40 M allocations: 3.657 GiB, 18.14% gc time)

DifferentialEquations.jl

controlling the accuracy

#time span
tspan = (0.0,20.0) 

prob = ODEProblem(time_evolution,ψ0,tspan)
@time sol = solve(prob)
32.249755 seconds (28.10 k allocations: 141.020 GiB, 29.34% gc time)

not controlling the accuracy

#time span
tspan = (0.0,20.0)

prob = ODEProblem(time_evolution,ψ0,tspan)
@time sol = solve(prob,RK4(),dt=1/3,adaptive=false)
7.910114 seconds (6.38 k allocations: 35.919 GiB, 29.14% gc time)

Because of a fixed dt, it surely became faster in DifferentialEquations.jl.
However, it doesn't match the RK4 one despite using the same dt.
I don't know why...

No matter how hard I try, it can't be faster as python?

(2020/10/29)
From my friend's advice, Runge Kutta code was improved.

function rk4(ψ::Array{Complex{Float64}, 1},H,δt::Float64)
    
    f(a) = -im*mul!(similar(a),H,a)
    
    k1 = f(ψ)
    k2 = f(ψ + (1/2)*δt*k1)
    k3 = f(ψ + (1/2)*δt*k2)
    k4 = f(ψ + δt*k3)
    ψ = ψ + δt*(1/6)*(k1 + 2*k2 + 2*k3 + k4)
    
    return ψ
    
end

Then, using H = sparse(Ham(Lx,Ly,Lz)) yielded 0.007810 seconds (1.33 k allocations: 19.391 MiB) for the same setting.

yosuga
  • 341
  • 1
  • 10

1 Answers1

3

You didn't compare the same things. The DifferentialEquations.jl one is adaptive time stepping to hit the tolerances. The other RK4 is fixed time step and gets whatever error comes out. Of course calculating the solution with a few orders of magnitude more accuracy is more expensive! They are taking drastically different numbers of time steps so of course the computational cost is different. Try:

using BenchmarkTools
@btime sol = solve(prob,RK4(),dt=...,adaptive=false)

and if both are using the same dt then they should be the same.

Why so fast?

Because it's not controlling the accuracy, so it's really quickly giving an answer with an uncontrolled amount of error, which you can force DifferentialEquations.jl to do but it's not going to do it by default since it's trying to solve the ODE.

Chris Rackauckas
  • 18,645
  • 3
  • 50
  • 81
  • Thank you for the answer. I almost understood the differences. By the way, I want to know how to force DifferentialEquations.jl not to controll the accuracy. If it's not too much trouble, can you tell me? P.S. I will delete the discourse one. – yosuga Oct 23 '20 at 12:18
  • 1
    I showed it above: you say `adaptive=false` and give it a dt. – Chris Rackauckas Oct 23 '20 at 16:57
  • Because of a fixed dt, it surely became faster in DifferentialEquations.jl. However, it doesn't match the RK4 one despite using the same dt. I don't know why... Please read the main text for details. – yosuga Oct 24 '20 at 13:18
  • Did you use SimpleDiffEq.SimpleRK4()? Are you measuring compile time? It's nothing but the loop and you can look at the the assembly, so something else is weird about your call. – Chris Rackauckas Oct 24 '20 at 18:26
  • I used only `using DifferentialEquations` and `using OrdinaryDiffEq` for the above results. After your comment, I installed `SimpleDiffEq` and tried to use SimpleDiffEq.SimpleRK4(), but both `solve(prob,SimpleRK4(),dt=1/3,adaptive=false)` and `solve(prob,SimpleDiffEq.SimpleRK4(),dt=1/3,adaptive=false)` showed `no method matching`. Sorry, I don't understand what time `@time` measures, but run times are clearly different. – yosuga Oct 24 '20 at 23:21
  • `@time`, if only ran once, is also measuring compile time. Did you run it twice? – Chris Rackauckas Oct 25 '20 at 08:56
  • I only ran it once. I don't know if it's related, I ran it on jupyter notebook. – yosuga Oct 25 '20 at 12:28
  • Most of what you saw was probably compilation time, which accounts for the difference. This is constant and based on the code length. With compilation time removed are they the same? – Chris Rackauckas Oct 25 '20 at 17:08
  • I understood what you said. I ran `@time` twice to remove its compilation time, but calculation times were not same. `Interpolation: 3rd order Hermite` is related? – yosuga Oct 27 '20 at 00:08