Initial step length guess

Some of these routines are tightly integrated with Optim.

Provides a static initial step length.

The keyword alpha sets the static step length (default 1.0). If the keyword scaled = true, the initial step length is scaled by the ℓ2 norm of the step direction.
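The static rule can be sketched in a few lines of Julia. This is a minimal illustration, not the package's implementation: the function name static_initial_step is hypothetical, and it assumes that "scaled" means dividing alpha by the ℓ2 norm of the step direction s. The package may apply further safeguards.

```julia
using LinearAlgebra  # provides norm

# Hypothetical helper, not part of any package API.
# alpha:  static step length
# scaled: if true, divide alpha by the l2 norm of the step direction s
function static_initial_step(s::AbstractVector; alpha::Real = 1.0, scaled::Bool = false)
    ns = norm(s)                  # l2 norm of the step direction
    if scaled && ns > 0
        return alpha / ns         # assumption: scaling means dividing by norm(s)
    end
    return float(alpha)           # unscaled, or zero direction: return alpha as-is
end

static_initial_step([3.0, 4.0])                  # unscaled guess
static_initial_step([3.0, 4.0]; scaled = true)   # guess divided by norm(s) = 5
```

With scaling enabled under this assumption, the first trial point lies at distance alpha from the current iterate regardless of how long the step direction is.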

Uses the previous step length as the initial guess, restricted to the bounds [alphamin, alphamax].

If state.alpha is NaN, fall back to the value is.alpha.
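A minimal sketch of this rule follows. The function name previous_initial_step is hypothetical, the keyword fallback plays the role of is.alpha, and the default bounds shown here (0.0 and Inf) are assumptions made for illustration.

```julia
# Hypothetical helper, not part of any package API.
# alpha_prev: step length accepted at the previous iteration (NaN if none)
# fallback:   value used when alpha_prev is NaN (the role of is.alpha above)
function previous_initial_step(alpha_prev::Real; fallback::Real = 1.0,
                               alphamin::Real = 0.0, alphamax::Real = Inf)
    alpha = isnan(alpha_prev) ? fallback : alpha_prev   # NaN: no usable previous step
    return clamp(alpha, alphamin, alphamax)             # enforce [alphamin, alphamax]
end

previous_initial_step(0.3)                    # reuses the previous step
previous_initial_step(NaN; fallback = 0.5)    # falls back when no step is stored
previous_initial_step(5.0; alphamax = 2.0)    # clipped to the upper bound
```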

Quadratic interpolation for initial step length guess.

This is meant for methods that do not produce well-scaled search directions, such as Gradient Descent and (variations of) Conjugate Gradient methods. See the discussion around equation (3.60) in Nocedal and Wright, 2nd ed.

This procedure has several arguments, with the following defaults:

  • α0 = 1.0. The initial step size at the first iteration.
  • αmin = 1e-12. The minimum initial step size. (Default arbitrary).
  • αmax = 1.0. The maximum initial step size.
  • ρ = 0.25. Maximum decrease from previous iteration, αinit ≥ ρ α_{k-1}. (Default arbitrary).
  • snap2one = (0.75, Inf). Set all values within this (closed) interval to 1.0. (Default arbitrary).

If αmax ≠ 1.0, consider ensuring that snap2one[2] < αmax.
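The quadratic rule and its safeguards can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the function name quadratic_initial_step and its positional arguments are hypothetical, the raw guess uses the interpolation formula 2(f_k - f_{k-1}) / φ'(0) from Nocedal and Wright (3.60), and the order in which the safeguards are applied is our reading of the argument list above.

```julia
# Hypothetical helper, not part of any package API.
# f_k:        objective value at the current iterate
# f_prev:     objective value at the previous iterate (non-finite if none)
# dphi0:      directional derivative along the current step direction
# alpha_prev: step length accepted at the previous iteration
function quadratic_initial_step(f_k, f_prev, dphi0, alpha_prev;
                                α0 = 1.0, αmin = 1e-12, αmax = 1.0,
                                ρ = 0.25, snap2one = (0.75, Inf))
    # No previous value, or a flat slope: use the first-iteration step α0.
    if !isfinite(f_prev) || dphi0 == 0
        return α0
    end
    α = 2 * (f_k - f_prev) / dphi0       # quadratic interpolation guess
    α = max(α, ρ * alpha_prev, αmin)     # limit the decrease and respect αmin
    α = min(α, αmax)                     # respect αmax
    if snap2one[1] <= α <= snap2one[2]   # closed interval snaps to a unit step
        α = 1.0
    end
    return α
end

quadratic_initial_step(1.0, 2.0, -4.0, 1.0)      # plain interpolation guess
quadratic_initial_step(1.0, 2.75, -4.0, 1.0)     # guess inside snap2one
quadratic_initial_step(1.0, 1.0625, -4.0, 1.0)   # guess limited by ρ
```

Snapping to 1.0 is useful when the calling method eventually produces well-scaled directions, since a unit step is then the natural first trial.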

Constant first-order change approximation to determine initial step length.

This requires that the optimization algorithm stores dphi0 from the previous iteration (dphi0previous = real(dot(∇f_{k-1}, s_{k-1})), where s is the step direction).

This is meant for methods that do not produce well-scaled search directions, such as Gradient Descent and (variations of) Conjugate Gradient methods. See the discussion in Nocedal and Wright, 2nd ed, p. 59, on "Initial Step Length".

This procedure has several arguments, with the following defaults:

  • α0 = 1.0. The initial step size at the first iteration.
  • αmin = 1e-12. The minimum initial step size. (Default arbitrary).
  • αmax = 1.0. The maximum initial step size.
  • ρ = 0.25. Maximum decrease from previous iteration, αinit ≥ ρ α_{k-1}. (Default arbitrary).
  • snap2one = (0.75, Inf). Set all values within this (closed) interval to 1.0. (Default arbitrary).

If αmax ≠ 1.0, consider ensuring that snap2one[2] < αmax.
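A sketch of the constant first-order change rule is given below. It assumes the guess from Nocedal and Wright, p. 59: the new step is chosen so that α φ'(0) matches the value from the previous iteration, that is, α = α_{k-1} dphi0previous / dphi0. The function name constant_change_initial_step is hypothetical, and the fallback to α0 when no usable history exists is an assumption.

```julia
# Hypothetical helper, not part of any package API.
# alpha_prev: step length accepted at the previous iteration
# dphi0_prev: directional derivative stored from the previous iteration
# dphi0:      directional derivative along the current step direction
function constant_change_initial_step(alpha_prev, dphi0_prev, dphi0;
                                      α0 = 1.0, αmin = 1e-12, αmax = 1.0,
                                      ρ = 0.25, snap2one = (0.75, Inf))
    # Without usable history (first iteration) or with a flat slope, use α0.
    if !isfinite(dphi0_prev) || !isfinite(alpha_prev) || dphi0 == 0
        return α0
    end
    α = alpha_prev * dphi0_prev / dphi0    # keep the first-order change constant
    α = min(max(α, ρ * alpha_prev, αmin), αmax)
    return snap2one[1] <= α <= snap2one[2] ? 1.0 : α
end

constant_change_initial_step(0.5, -2.0, -4.0)   # slope doubled, so the step halves
constant_change_initial_step(NaN, NaN, -4.0)    # no history: α0
```

The caller is responsible for storing the current dphi0 after each iteration so that it is available as dphi0_prev on the next call.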

Initial step size algorithm from W. W. Hager and H. Zhang (2006) Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent. ACM Transactions on Mathematical Software 32: 113–137.

If α0 is NaN, procedure I0 is called at the first iteration; otherwise the step is selected according to procedure I1-2, with starting value α0.
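For orientation, procedure I0 can be sketched as below. This follows our recollection of the Hager and Zhang paper rather than any package source, so treat the details as assumptions: the function name hz_initial_step_I0 is hypothetical, and ψ0 = 0.01 is the small constant we recall the paper using by default.

```julia
using LinearAlgebra  # provides norm

# Hypothetical helper sketching procedure I0; not part of any package API.
# x:   starting point
# g:   gradient at x
# f_x: objective value at x
# ψ0:  small positive constant (assumed default 0.01)
function hz_initial_step_I0(x, g, f_x; ψ0 = 0.01)
    g_max = norm(g, Inf)                    # largest gradient component in magnitude
    g_max == 0 && return 1.0                # zero gradient: nothing to scale against
    x_max = norm(x, Inf)
    if x_max != 0
        return ψ0 * x_max / g_max           # step relative to the size of x
    elseif f_x != 0
        return ψ0 * abs(f_x) / norm(g)^2    # x is zero: scale by the function value
    end
    return 1.0                              # no scale information available
end

hz_initial_step_I0([1.0, -2.0], [4.0, 0.5], 3.0)   # uses the size of x
hz_initial_step_I0([0.0, 0.0], [3.0, 4.0], 5.0)    # uses the function value
```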
