# Line Search Methods
As previously stated, the calculation of the next step is written mathematically as:
$$x_{k+1} = x_k + \alpha_k p_k$$
where $\alpha_k$ is a positive scalar called the "step length" and $p_k$ is the search direction.
Most line search algorithms require $p_k$ to be a descent direction, meaning $p_k^T \nabla f_k < 0$, because this guarantees that the objective function can be reduced along $p_k$.
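As a minimal sketch of how these pieces fit together, the generic line-search loop below applies the update above; `choose_direction` and `choose_step` are hypothetical placeholders for whichever direction and step-length rules are in use.

```python
import numpy as np

def line_search_minimize(f, grad_f, x0, choose_direction, choose_step,
                         max_iter=100, tol=1e-8):
    """Generic line-search loop applying x_{k+1} = x_k + alpha_k * p_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:            # stop at an (approximate) stationary point
            break
        p = choose_direction(x, g)             # search direction p_k
        assert g @ p < 0, "p_k must be a descent direction"
        alpha = choose_step(f, grad_f, x, p)   # positive step length alpha_k
        x = x + alpha * p                      # the update x_{k+1} = x_k + alpha_k p_k
    return x
```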
## Step length
$\alpha_k$ is a positive scalar that dictates how far to move along the descent direction.
![image info](./step_length.png)
When deciding on the step length, we look for a value of $\alpha_k$ that satisfies the "Wolfe conditions":
### Sufficient Decrease: the step must yield a sufficient reduction in the objective function
![image info](./wolfe_condition_1.png)
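For reference, the sufficient decrease (Armijo) condition pictured above is standardly written as

$$f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f_k^T p_k, \qquad 0 < c_1 < 1,$$

where $c_1$ is typically small (e.g. $c_1 = 10^{-4}$), so the required decrease is proportional to both the step length and the directional derivative.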
### Curvature Condition: the rate of decrease at the new point must be sufficiently smaller than at the start
![image info](./wolfe_condition_2.png)
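In its standard form, the curvature condition requires the slope of $\phi(\alpha) = f(x_k + \alpha p_k)$ at $\alpha_k$ to be no smaller (i.e. less negative) than $c_2$ times the initial slope:

$$\nabla f(x_k + \alpha_k p_k)^T p_k \ge c_2 \nabla f_k^T p_k, \qquad 0 < c_1 < c_2 < 1.$$

This rules out steps so short that the function is still decreasing steeply, where further progress along $p_k$ could be made.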
### Strong Wolfe Condition: force $\alpha_k$ to lie in a broad neighborhood of a local minimizer or stationary point
![image info](./wolfe_condition_3.png)
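The standard strong Wolfe form replaces the curvature condition with a two-sided bound on the new slope:

$$|\nabla f(x_k + \alpha_k p_k)^T p_k| \le c_2 |\nabla f_k^T p_k|,$$

which also excludes points where $\phi$ is increasing steeply, keeping $\alpha_k$ close to a stationary point of $\phi$.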
### All Wolfe conditions:
![image info](./all_wolfe.png)
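To make the conditions concrete, here is a minimal NumPy sketch of a Wolfe-condition check and a simple backtracking search; the defaults $c_1 = 10^{-4}$, $c_2 = 0.9$, and the shrink factor `rho` are conventional choices, not values from this document.

```python
import numpy as np

def satisfies_wolfe(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9):
    """Check both (weak) Wolfe conditions for a candidate step length alpha."""
    slope = grad_f(x) @ p                     # directional derivative; negative for descent
    x_new = x + alpha * p
    sufficient_decrease = f(x_new) <= f(x) + c1 * alpha * slope
    curvature = grad_f(x_new) @ p >= c2 * slope
    return sufficient_decrease and curvature

def backtracking_step(f, grad_f, x, p, alpha=1.0, rho=0.5, c1=1e-4, max_iter=50):
    """Shrink alpha geometrically until the sufficient decrease condition holds."""
    slope = grad_f(x) @ p
    for _ in range(max_iter):
        if f(x + alpha * p) <= f(x) + c1 * alpha * slope:
            break
        alpha *= rho                          # step was too long; backtrack
    return alpha
```

Note that backtracking enforces only the sufficient decrease condition: because it starts from a full step and only ever shrinks, overly short steps are avoided by construction rather than by checking curvature.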
### Proof: for a convex objective function there exists a step length $\alpha$ that yields the exact minimizer along the gradient descent direction:
![image info](./alpha_gd_minimizer.png)
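For the special case of a strongly convex quadratic $f(x) = \frac{1}{2} x^T Q x - b^T x$ (the setting in which this step length has a closed form, and possibly the one shown above), minimizing $\phi(\alpha) = f(x_k - \alpha \nabla f_k)$ exactly gives

$$\frac{d\phi}{d\alpha} = -\nabla f_k^T \nabla f_k + \alpha \, \nabla f_k^T Q \nabla f_k = 0 \quad\Longrightarrow\quad \alpha_k = \frac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k}.$$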
## Search Direction
Commonly the search direction is defined as $$p_k = -B_k^{-1} \nabla f_k$$ where $B_k$ is either the Hessian or an approximation of the Hessian. For simple search directions (as in gradient descent) we take $B_k$ to be the identity matrix, so $p_k = -\nabla f_k$.
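As a small illustration of these choices, the sketch below computes $p_k$ for the identity and exact-Hessian cases; the objective `f(x) = x1^2 + 10*x2^2` and its derivatives are hypothetical examples, not from this document.

```python
import numpy as np

def steepest_descent_direction(grad):
    """B_k = I, so p_k = -grad f_k."""
    return -grad

def newton_direction(grad, hess):
    """B_k = exact Hessian: solve B_k p_k = -grad f_k instead of forming the inverse."""
    return np.linalg.solve(hess, -grad)

# Hypothetical quadratic objective: f(x) = x1^2 + 10*x2^2, minimized at the origin.
grad_f = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
hess_f = lambda x: np.diag([2.0, 20.0])

x = np.array([1.0, 1.0])
p_gd = steepest_descent_direction(grad_f(x))       # [-2, -20]
p_newton = newton_direction(grad_f(x), hess_f(x))  # [-1, -1], points exactly at the minimizer
```

Quasi-Newton methods such as BFGS sit between these two extremes, building $B_k$ from gradient differences so that no second derivatives are needed.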