0

I am solving a differential equation whose solution comes up with size (2000,2000000), i.e., size(sol)=(2000,2000000). Now I want to store the time evolution of first 1000 variable to an array. for instance say,

      x=sol[1:1000,:];

and from this I need to calculate a value

    r1= abs.(mean(exp.(x*1im),dims=1)[1,:])[1:end]

Now storing the solution to x and then calculating r1 takes too much time and even something the system crash. Is there any solution to this ?

I am using Atom to run the code and the RAM of my system is 48 GB.

MSA
  • 55
  • 3
  • Your code is creating a large amount of of apparently unnecessary temporary arrays. `[1, :]` and `[1:end]` make full copies of the input array, yet seem pointless. – DNF Nov 08 '22 at 06:35
  • Also, please create a Minimal Reproducible Example, which others can copy and run to test for themselves, for example using random dummy data for `sol`. Right now, we cannot even know what the type of `sol` is. Also, make sure your code is a function, not a script running in global scope. – DNF Nov 08 '22 at 06:41
  • Have you tried setting `saveat` and `save_idxs` to just store less? I don't understand why you'd save a `2000 x 2000000` array if you don't want all of those values. – Chris Rackauckas Nov 08 '22 at 11:21

1 Answers1

3

I am going to assume that the array sol is a Matrix{Float64}, though this would be something that is important to include in your question.

Then I will point out some places where you are creating unnecessary allocations:

x = sol[1:1000,:]
r1 = abs.(mean(exp.(x*1im),dims=1)[1,:])[1:end]

Starting from the innermost:

  1. x*1im creates a full new array of the same size as x.
  2. exp.() creates another full array of the same size as x.
  3. mean creates another array, though that is a reasonable thing to expect.
  4. [1,:] creates a new array the same size as one row of x
  5. [1:end] also creates a copy of the output vector, and this doesn't seem to have any purpose at all.
  6. x = sol[1:1000, :] allocates a large array.

We want to get rid of most of these:

1. (and 2.)\ You could write exp.(x .* 1im) with a dot after x, but it is more efficient to use the cis function, which does the same thing as exp(1im * x)

  1. Broadcasting is nice, but since you want to just average along the columns, there's no need for a temp array, just write mean(cis, x; dims=1) instead, that does not create the temporary.
  2. You can use the vec function to force a vector from a matrix without copying data.
  3. Just remove this.
  4. Try making a view to avoid copying.

Result:

# original
function absmean(sol)
    x = sol[1:1000,:]
    r1 = abs.(mean(exp.(x*1im),dims=1)[1,:])[1:end]
    return r1
end

# new version
function absmean2(sol)
    x = @view sol[1:1000, :]
    r1 = abs.(mean(cis, x; dims=1)) |> vec
    return r1
end

Benchmark:

# making a smaller version to reduce benchmarking time
julia> sol = randn(2000, 2000);

julia> @btime absmean($sol);
  72.293 ms (18 allocations: 76.39 MiB)

julia> @btime absmean2($sol);
  29.985 ms (15 allocations: 47.88 KiB)

Not that much faster, but less than 1/1000th of the memory use. There are many more things you can do to improve performance, but these were the 'low-hanging fruit'.

Final remark. It seems like your solution data are oriented along the rows of sol instead of along the columns. That is a bit strange, since Julia arrays are column-major. It doesn't matter in this particular calculation, but in other cases it could be detrimental to performance.

Edit: I'll just add a version to show how you can get more dramatic speedups. This uses LoopVectorization.jl with multithreading (6 cores, 12 threads). Some re-writing of the code is necessary to get LoopVectorization to accept the code:

using LoopVectorization

function absmean4(sol)
    x = @view sol[1:1000, :]
    r = similar(x, size(x, 2))
    # use @turbo instead of @tturbo for single-threaded
    @tturbo for j in axes(x, 2)  
        s = zero(eltype(x))
        c = zero(eltype(x))
        for i in axes(x, 1)
            (a, b) = sincos(x[i, j])
            s += a
            c += b
        end
        r[j] = sqrt(s^2 + c^2) / size(x, 1)
    end
    return r
end

julia> @btime absmean4($sol);
  1.830 ms (1 allocation: 15.75 KiB)

That's a 40x speedup vs the original, and 5000x less memory use.

If you want to go absolutely all out, you can pre-allocate the output vector r and pass it into your function. Then you will have zero allocations.

DNF
  • 11,584
  • 1
  • 26
  • 40
  • @MSA If it answers your question fully, you can consider 'accepting' the answer. – DNF Nov 09 '22 at 09:52