Nonlinear constrained optimization

Tip

This example is also available as a Jupyter notebook: ipnewton_basics.ipynb

The nonlinear constrained optimization interface in Optim assumes that the user can write the optimization problem in the following way.

\[\min_{x\in\mathbb{R}^n} f(x) \quad \text{such that}\\ l_x \leq \phantom{c(}x\phantom{)} \leq u_x \\ l_c \leq c(x) \leq u_c.\]

For equality constraints on $x_j$ or $c(x)_j$, set the corresponding entries of the bounds equal, $l_j=u_j$. Likewise, setting $l_j=-\infty$ or $u_j=\infty$ leaves the constraint unbounded from below or above, respectively.
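
For instance, to fix $x_1$ at the value $1$ while leaving $x_2$ completely free, the bound vectors could look as follows (a hypothetical sketch; the constructor that consumes these vectors is introduced below):

lx = [1.0, -Inf]   # lower bounds: x₁ fixed at 1, x₂ unbounded below
ux = [1.0,  Inf]   # upper bounds: x₁ fixed at 1, x₂ unbounded above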

Constrained optimization with IPNewton

We will go through examples on how to use the constraints interface with the interior-point Newton optimization algorithm IPNewton.

Throughout these examples we work with the standard Rosenbrock function. The objective and its derivatives are given by

fun(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function fun_grad!(g, x)
    g[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    g[2] = 200.0 * (x[2] - x[1]^2)
end

function fun_hess!(h, x)
    h[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
    h[1, 2] = -400.0 * x[1]
    h[2, 1] = -400.0 * x[1]
    h[2, 2] = 200.0
end;
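
For reference, the gradient and Hessian implemented above are

\[\nabla f(x) = \begin{pmatrix} -2(1-x_1) - 400\,x_1(x_2 - x_1^2) \\ 200\,(x_2 - x_1^2) \end{pmatrix}, \qquad
\nabla^2 f(x) = \begin{pmatrix} 2 - 400\,x_2 + 1200\,x_1^2 & -400\,x_1 \\ -400\,x_1 & 200 \end{pmatrix}.\]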

Optimization interface

To solve a constrained optimization problem we call the optimize method

optimize(d::AbstractObjective, constraints::AbstractConstraints, initial_x::Tx, method::ConstrainedOptimizer, options::Options)

We can create instances of AbstractObjective and AbstractConstraints using the types TwiceDifferentiable and TwiceDifferentiableConstraints from the package NLSolversBase.jl.

Box minimization

We want to optimize the Rosenbrock function in the box $-0.5 \leq x \leq 0.5$, starting from the point $x_0=(0,0)$. Box constraints are defined using, for example, TwiceDifferentiableConstraints(lx, ux).

x0 = [0.0, 0.0]
df = TwiceDifferentiable(fun, fun_grad!, fun_hess!, x0)

lx = [-0.5, -0.5]; ux = [0.5, 0.5]
dfc = TwiceDifferentiableConstraints(lx, ux)

res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     2.500000e-01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 4.39e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 8.79e-10 ≰ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 1.00e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    43
    f(x) calls:    68
    ∇f(x) calls:   68

As elsewhere in Optim, you can also use autodiff=:forward and just pass in fun instead of supplying fun_grad! and fun_hess!.
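
A minimal sketch of this, reusing the box constraints dfc from above (the keyword is autodiff = :forward, which uses ForwardDiff.jl):

df_ad = TwiceDifferentiable(fun, x0; autodiff = :forward)  # gradient and Hessian via automatic differentiation
res_ad = optimize(df_ad, dfc, x0, IPNewton())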

If we only want to set lower bounds, we set the upper bounds to infinity with ux = fill(Inf, 2):

ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success (objective increased between iterations)

 * Candidate solution
    Final objective value:     7.987239e-20

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 3.54e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 3.54e-10 ≰ 0.0e+00
    |f(x) - f(x')|         = 2.40e-19 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.00e+00 ≰ 0.0e+00
    |g(x)|                 = 8.83e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    35
    f(x) calls:    63
    ∇f(x) calls:   63

Defining "unconstrained" problems

An unconstrained problem can be defined either by passing Inf bounds or empty arrays. Note that we must pass the correct element type for the empty lx and ux:

lx = fill(-Inf, 2); ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

lx = Float64[]; ux = Float64[]
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     5.998937e-19

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 1.50e-09 ≰ 0.0e+00
    |x - x'|/|x'|          = 1.50e-09 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.80e-18 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 3.00e+00 ≰ 0.0e+00
    |g(x)|                 = 7.92e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    34
    f(x) calls:    63
    ∇f(x) calls:   63

Generic nonlinear constraints

We now consider the Rosenbrock problem with a constraint on

\[ c(x)_1 = x_1^2 + x_2^2.\]

We pass the information about the constraints to optimize by defining a vector function c(x) and its Jacobian J(x).

The Hessian information is treated differently, by considering the Lagrangian of the corresponding slack-variable transformed optimization problem. This is similar to how the CUTEst library works. Let $H_j(x)$ represent the Hessian of the $j$th component $c(x)_j$ of the generic constraints, and $\lambda_j$ the corresponding dual variable in the Lagrangian. Then we want the constraint object to add the values of $H_j(x)$ to the Hessian of the objective, weighted by $\lambda_j$.
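
That is, writing $H_j(x) = \nabla^2 c(x)_j$, the constraint Hessian callback is expected to perform the in-place update

\[h \leftarrow h + \sum_j \lambda_j H_j(x).\]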

In Julia, the constraint function $c(x)$ and its derivative information are then supplied in the following way.

con_c!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
function con_jacobian!(J, x)
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    J
end
function con_h!(h, x, λ)
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
end;

Note that con_h! adds the λ-weighted Hessian value of each element of c(x) to the Hessian of fun.

We can then optimize the Rosenbrock function inside the ball of radius $0.5$.

lx = Float64[]; ux = Float64[]
lc = [-Inf]; uc = [0.5^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     2.966216e-01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 7.71e-01 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    28
    f(x) calls:    109
    ∇f(x) calls:   109
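
To check that the constraint is active at the solution, we can inspect the minimizer; a quick sanity check using Optim's accessor functions:

x_opt = Optim.minimizer(res)
sqrt(sum(abs2, x_opt))  # distance from the origin; expected to be close to the radius 0.5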

We can add a lower bound on the constraint, and thus optimize the objective on the annulus with inner and outer radii $0.1$ and $0.5$ respectively.

lc = [0.1^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     2.966216e-01

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 0.00e+00 ≤ 0.0e+00
    |x - x'|/|x'|          = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|         = 0.00e+00 ≤ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 0.00e+00 ≤ 0.0e+00
    |g(x)|                 = 7.71e-01 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    34
    f(x) calls:    158
    ∇f(x) calls:   158

Note that the algorithm warns that the initial guess is not an interior point. IPNewton can often handle this; however, if the initial guess is such that c(x) = u_c, the algorithm currently fails. We may fix this in the future.
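
One way to avoid such issues is to start from a point that strictly satisfies $l_c < c(x_0) < u_c$. For the annulus above, any point with $0.1 < \lVert x_0 \rVert < 0.5$ will do; for example (a sketch reusing the df and dfc objects defined above):

x0_interior = [0.3, 0.0]  # strictly inside the annulus, since 0.1 < 0.3 < 0.5
res_interior = optimize(df, dfc, x0_interior, IPNewton())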

Multiple constraints

The following example illustrates how to add an additional constraint. In particular, we add a constraint function

\[ c(x)_2 = x_2\sin(x_1)-x_1.\]
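
Its Jacobian row and Hessian are

\[\nabla c(x)_2 = \begin{pmatrix} x_2\cos(x_1) - 1 \\ \sin(x_1) \end{pmatrix}, \qquad
\nabla^2 c(x)_2 = \begin{pmatrix} -x_2\sin(x_1) & \cos(x_1) \\ \cos(x_1) & 0 \end{pmatrix},\]

which is what the callbacks below implement.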

function con2_c!(c, x)
    c[1] = x[1]^2 + x[2]^2     ## First constraint
    c[2] = x[2]*sin(x[1])-x[1] ## Second constraint
    c
end
function con2_jacobian!(J, x)
    # First constraint
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    # Second constraint
    J[2,1] = x[2]*cos(x[1])-1.0
    J[2,2] = sin(x[1])
    J
end
function con2_h!(h, x, λ)
    # First constraint
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
    # Second constraint
    h[1,1] += λ[2]*x[2]*-sin(x[1])
    h[1,2] += λ[2]*cos(x[1])
    # Symmetrize h
    h[2,1]  = h[1,2]
    h
end;

We generate the constraint objects and call IPNewton with initial guess $x_0 = (0.25,0.25)$.

x0 = [0.25, 0.25]
lc = [-Inf, 0.0]; uc = [0.5^2, 0.0]
dfc = TwiceDifferentiableConstraints(con2_c!, con2_jacobian!, con2_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())
 * Status: success

 * Candidate solution
    Final objective value:     1.000000e+00

 * Found with
    Algorithm:     Interior Point Newton

 * Convergence measures
    |x - x'|               = 6.90e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 3.55e+08 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.38e-09 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 1.38e-09 ≰ 0.0e+00
    |g(x)|                 = 2.00e+00 ≰ 1.0e-08

 * Work counters
    Seconds run:   0  (vs limit Inf)
    Iterations:    29
    f(x) calls:    215
    ∇f(x) calls:   215
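
As before, the minimizer, the minimum, and the constraint values at the solution can be inspected using Optim's standard accessors and the constraint function defined above:

x_opt = Optim.minimizer(res)
f_opt = Optim.minimum(res)
c_opt = con2_c!(zeros(2), x_opt)  # evaluate both constraints at the solution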

Plain Program

Below follows a version of the program without any comments. The file is also available here: ipnewton_basics.jl

using Optim, NLSolversBase #hide
import NLSolversBase: clear! #hide

fun(x) =  (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

function fun_grad!(g, x)
    g[1] = -2.0 * (1.0 - x[1]) - 400.0 * (x[2] - x[1]^2) * x[1]
    g[2] = 200.0 * (x[2] - x[1]^2)
end

function fun_hess!(h, x)
    h[1, 1] = 2.0 - 400.0 * x[2] + 1200.0 * x[1]^2
    h[1, 2] = -400.0 * x[1]
    h[2, 1] = -400.0 * x[1]
    h[2, 2] = 200.0
end;

x0 = [0.0, 0.0]
df = TwiceDifferentiable(fun, fun_grad!, fun_hess!, x0)

lx = [-0.5, -0.5]; ux = [0.5, 0.5]
dfc = TwiceDifferentiableConstraints(lx, ux)

res = optimize(df, dfc, x0, IPNewton())

ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

lx = fill(-Inf, 2); ux = fill(Inf, 2)
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

lx = Float64[]; ux = Float64[]
dfc = TwiceDifferentiableConstraints(lx, ux)

clear!(df)
res = optimize(df, dfc, x0, IPNewton())

con_c!(c, x) = (c[1] = x[1]^2 + x[2]^2; c)
function con_jacobian!(J, x)
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    J
end
function con_h!(h, x, λ)
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
end;

lx = Float64[]; ux = Float64[]
lc = [-Inf]; uc = [0.5^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())

lc = [0.1^2]
dfc = TwiceDifferentiableConstraints(con_c!, con_jacobian!, con_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())

function con2_c!(c, x)
    c[1] = x[1]^2 + x[2]^2     ## First constraint
    c[2] = x[2]*sin(x[1])-x[1] ## Second constraint
    c
end
function con2_jacobian!(J, x)
    # First constraint
    J[1,1] = 2*x[1]
    J[1,2] = 2*x[2]
    # Second constraint
    J[2,1] = x[2]*cos(x[1])-1.0
    J[2,2] = sin(x[1])
    J
end
function con2_h!(h, x, λ)
    # First constraint
    h[1,1] += λ[1]*2
    h[2,2] += λ[1]*2
    # Second constraint
    h[1,1] += λ[2]*x[2]*-sin(x[1])
    h[1,2] += λ[2]*cos(x[1])
    # Symmetrize h
    h[2,1]  = h[1,2]
    h
end;

x0 = [0.25, 0.25]
lc = [-Inf, 0.0]; uc = [0.5^2, 0.0]
dfc = TwiceDifferentiableConstraints(con2_c!, con2_jacobian!, con2_h!,
                                     lx, ux, lc, uc)
res = optimize(df, dfc, x0, IPNewton())

# This file was generated using Literate.jl, https://github.com/fredrikekre/Literate.jl

This page was generated using Literate.jl.