Numerical Optimization Chapter 3

Line Search Methods

As previously stated, the next iterate is computed as:

$$x_{k+1} = x_k + \alpha_k p_k$$

where $\alpha_k$ is a positive scalar called the "step length" and $p_k$ is the search direction.

Most line search algorithms require $p_k$ to be a descent direction, i.e. one satisfying $p_k^T \nabla f_k < 0$, because this guarantees that the objective function can be reduced along $p_k$.
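
To make the iteration concrete, here is a minimal NumPy sketch on a small quadratic objective of my own choosing (the matrix $Q$, vector $b$, starting point, and fixed step length below are illustrative assumptions, not from the text):

```python
import numpy as np

# Toy quadratic objective (illustrative, not from the text): f(x) = 0.5 x^T Q x - b^T x
Q = np.array([[3.0, 0.5],
              [0.5, 2.0]])
b = np.array([1.0, 1.0])

def f(x):
    return 0.5 * x @ Q @ x - b @ x

def grad_f(x):
    return Q @ x - b

x = np.array([5.0, -3.0])          # starting point x_0
for k in range(50):
    g = grad_f(x)
    p = -g                         # steepest descent: always a descent direction
    assert p @ g <= 0              # p_k^T grad f_k < 0 (zero only at a stationary point)
    alpha = 0.1                    # placeholder; in practice a line search chooses this
    x = x + alpha * p              # x_{k+1} = x_k + alpha_k p_k

print(x, np.linalg.solve(Q, b))    # x approaches the minimizer Q^{-1} b
```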

Step Length

$\alpha_k$ is a positive scalar that dictates how far to move along the descent direction.


When deciding on a step length, we look for one that satisfies the Wolfe conditions.

Sufficient Decrease (Armijo condition): the step must give a sufficient reduction of the objective function:

$$f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f_k^T p_k, \qquad 0 < c_1 < 1$$

Curvature Condition: the slope of $f$ along $p_k$ at the new point must not be too steeply negative, which rules out steps that are too short:

$$\nabla f(x_k + \alpha_k p_k)^T p_k \ge c_2 \nabla f_k^T p_k, \qquad c_1 < c_2 < 1$$

Strong Wolfe Condition: a stricter curvature condition that forces $\alpha_k$ to lie in at least a broad neighborhood of a local minimizer or stationary point of $\phi(\alpha) = f(x_k + \alpha p_k)$:

$$|\nabla f(x_k + \alpha_k p_k)^T p_k| \le c_2 |\nabla f_k^T p_k|$$

All Wolfe conditions together, with $0 < c_1 < c_2 < 1$:

$$f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f_k^T p_k$$

$$|\nabla f(x_k + \alpha_k p_k)^T p_k| \le c_2 |\nabla f_k^T p_k|$$
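
As a sketch, here is a small checker for these conditions (the helper name `satisfies_wolfe` is my own; $c_1 = 10^{-4}$ and $c_2 = 0.9$ are typical default values):

```python
def satisfies_wolfe(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9, strong=True):
    """Check whether a candidate step length alpha satisfies the Wolfe conditions."""
    g0 = grad_f(x)
    phi0, dphi0 = f(x), g0 @ p               # phi(0) and phi'(0); dphi0 < 0 for a descent direction
    x_new = x + alpha * p
    # Sufficient decrease (Armijo): f(x + alpha p) <= f(x) + c1 alpha grad_f(x)^T p
    armijo = f(x_new) <= phi0 + c1 * alpha * dphi0
    dphi = grad_f(x_new) @ p                 # phi'(alpha)
    if strong:
        # Strong curvature condition: |phi'(alpha)| <= c2 |phi'(0)|
        curvature = abs(dphi) <= c2 * abs(dphi0)
    else:
        # Weak curvature condition: phi'(alpha) >= c2 phi'(0)
        curvature = dphi >= c2 * dphi0
    return armijo and curvature

# Example: check a few candidate step lengths on the toy quadratic above
x0 = np.array([5.0, -3.0])
p0 = -grad_f(x0)
for a in (0.01, 0.1, 0.5):
    print(a, satisfies_wolfe(f, grad_f, x0, p0, a))
```

In practice, SciPy provides `scipy.optimize.line_search`, which returns a step length satisfying the strong Wolfe conditions (or `None` if it cannot find one).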

For a strongly convex quadratic objective $f(x) = \tfrac{1}{2} x^T Q x - b^T x$, there is a value of $\alpha$ that exactly minimizes $f$ along the gradient descent direction: setting the derivative of $\phi(\alpha) = f(x_k - \alpha \nabla f_k)$ to zero gives

$$\alpha_k = \frac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k}$$
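
A quick numerical sanity check of this formula, reusing the toy quadratic from the first sketch ($Q$, $b$, `f`, and `grad_f` are the illustrative definitions above):

```python
# Verify that alpha_k = (g^T g) / (g^T Q g) minimizes phi(alpha) = f(x_k - alpha g)
x_k = np.array([5.0, -3.0])
g = grad_f(x_k)
alpha_exact = (g @ g) / (g @ Q @ g)

alphas = np.linspace(0.0, 1.0, 1001)
phi = [f(x_k - a * g) for a in alphas]
alpha_numeric = alphas[np.argmin(phi)]

print(alpha_exact, alpha_numeric)   # the two values agree up to the grid resolution
```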

Search Direction

Commonly the search direction is defined as $$p_k = -B_k^{-1} \nabla f_k$$ where $B_k$ is either the Hessian or an approximation of it. For simple search directions (such as gradient descent) $B_k$ is taken to be the identity matrix.
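
A minimal sketch of this, again on the toy quadratic above (the helper name `search_direction` is my own; solving $B_k p_k = -\nabla f_k$ is preferred over forming the inverse explicitly, and $B_k$ is assumed positive definite):

```python
def search_direction(grad, B=None):
    """Compute p_k = -B_k^{-1} grad f_k; B=None is treated as the identity."""
    if B is None:
        return -grad                      # B_k = I  ->  steepest descent direction
    return -np.linalg.solve(B, grad)      # solve B_k p_k = -grad f_k (avoid explicit inverse)

x_k = np.array([5.0, -3.0])
p_sd = search_direction(grad_f(x_k))              # steepest descent
p_newton = search_direction(grad_f(x_k), B=Q)     # Newton direction: B_k is the exact Hessian Q

# For a quadratic, the full Newton step lands on the minimizer in one iteration
print(x_k + p_newton, np.linalg.solve(Q, b))
```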
