
I have an implementation of L-BFGS, and would like to call the line search from LineSearches.jl to compare. However, the documentation is quite sparse and focuses only on the use of LineSearches.jl in the context of Optim.jl. I cannot find any examples of how to use LineSearches.jl independently.

  • We are currently working on making it less Optim/NLsolve-centric. Please provide more code here if you want me to show you how, open an issue at LineSearches.jl or visit the Optim.jl gitter channel. – pkofod May 30 '18 at 09:29

1 Answer


I have created an example of how to use the LineSearches algorithms with a custom-made optimizer in the latest documentation.

Note that the example currently requires LineSearches master, but should be available in v6.0.0 shortly.

Here's the full example, in case the links break: (EDIT: Updated with new example code that simplifies the process.)

Using LineSearches without Optim/NLsolve

Say we have written a gradient descent optimization algorithm but would like to experiment with different line search algorithms. The algorithm is implemented as follows.

using LinearAlgebra # for norm and dot

function gdoptimize(f, g!, fg!, x0::AbstractArray{T}, linesearch,
                    maxiter::Int = 10000,
                    g_rtol::T = sqrt(eps(T)), g_atol::T = eps(T)) where T <: Number
    x = copy(x0)
    gvec = similar(x)
    g!(gvec, x)
    fx = f(x)

    gnorm = norm(gvec)
    gtol = max(g_rtol*gnorm, g_atol)

    # Univariate line search functions; the closures capture the step
    # direction s, which is defined below in the same scope.
    ϕ(α) = f(x .+ α.*s)
    function dϕ(α)
        g!(gvec, x .+ α.*s)
        return dot(gvec, s)
    end
    function ϕdϕ(α)
        phi = fg!(gvec, x .+ α.*s)
        dphi = dot(gvec, s)
        return (phi, dphi)
    end

    s = similar(gvec) # Step direction

    iter = 0
    while iter < maxiter && gnorm > gtol
        iter += 1
        s .= -gvec

        dϕ_0 = dot(s, gvec)
        α, fx = linesearch(ϕ, dϕ, ϕdϕ, 1.0, fx, dϕ_0)

        @. x = x + α*s
        g!(gvec, x)
        gnorm = norm(gvec)
    end

    return (fx, x, iter)
end

Note that many optimization and line search algorithms allow the user to evaluate both the objective and the gradient at the same time, for computational efficiency. We have included this functionality in the algorithm as the input function fg!: even though the gradient descent algorithm does not use it explicitly, many of the LineSearches algorithms do.

The Gradient Descent gdoptimize method selects a descent direction and calls the line search algorithm linesearch which returns the step length α and the objective value fx = f(x + α*s).

The functions ϕ and dϕ represent a univariate objective and its derivative, which are used by the line search algorithms. To utilize the fg! function call in the optimizer, some of the line searches require a function ϕdϕ which returns the univariate objective and the derivative at the same time.
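To see the functor interface in isolation, here is a minimal sketch that calls a line search directly on a simple univariate function, without any surrounding optimizer. The call signature matches the one used in gdoptimize: `ls(ϕ, dϕ, ϕdϕ, α0, ϕ_0, dϕ_0)` returns the step length and the objective value at that step. The quadratic test function here is our own choice, not from LineSearches.

```julia
using LineSearches

# A simple univariate objective with minimizer at α = 2 (illustrative choice).
ϕ(α) = (α - 2)^2
dϕ(α) = 2 * (α - 2)
ϕdϕ(α) = (ϕ(α), dϕ(α))

ls = HagerZhang()

# Start the search at α0 = 1.0, passing ϕ(0) and dϕ(0) as required.
α, ϕα = ls(ϕ, dϕ, ϕdϕ, 1.0, ϕ(0.0), dϕ(0.0))
```

Note that a line search only guarantees sufficient decrease (e.g. the Wolfe conditions), so α need not be the exact univariate minimizer.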

Optimizing Rosenbrock

Here is an example to show how we can combine gdoptimize and LineSearches to minimize the Rosenbrock function, which is defined by

f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function g!(gvec, x)
    gvec[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    gvec[2] = 200.0 * (x[2] - x[1]^2)
    gvec
end

function fg!(gvec, x)
    g!(gvec, x)
    f(x)
end

We can now use gdoptimize with BackTracking to optimize the Rosenbrock function from a given initial condition x0.

x0 = [-1.0, 1.0]

using LineSearches
ls = BackTracking(order=3)
fx_bt3, x_bt3, iter_bt3 = gdoptimize(f, g!, fg!, x0, ls)

Interestingly, the StrongWolfe line search converges in one iteration, whilst all the other algorithms take thousands of iterations. This is just luck due to the particular choice of initial condition.

ls = StrongWolfe()
fx_sw, x_sw, iter_sw = gdoptimize(f, g!, fg!, x0, ls)
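Since the original goal was to compare line searches, a loop over several algorithms makes the comparison direct. This sketch reuses the gdoptimize, f, g!, fg! and x0 definitions from above; the particular list of algorithms is just a selection of those exported by LineSearches.

```julia
using LineSearches

# Compare a few line search algorithms on the same problem.
for ls in (HagerZhang(), MoreThuente(), BackTracking(order=2), BackTracking(order=3))
    fx, x, iter = gdoptimize(f, g!, fg!, x0, ls)
    println(nameof(typeof(ls)), ": f = ", fx, " after ", iter, " iterations")
end
```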