In optimization, a descent direction is a vector $\mathbf{p} \in \mathbb{R}^n$ that, in the sense below, moves us closer towards a local minimum $\mathbf{x}^*$ of our objective function $f : \mathbb{R}^n \to \mathbb{R}$.
Suppose we are computing $\mathbf{x}^*$ by an iterative method, such as line search. We define a descent direction $\mathbf{p}_k \in \mathbb{R}^n$ at the $k$th iterate $\mathbf{x}_k$ to be any $\mathbf{p}_k$ such that $\langle \mathbf{p}_k, \nabla f(\mathbf{x}_k) \rangle < 0$, where $\langle \cdot, \cdot \rangle$ denotes the inner product. The motivation for such an approach is that, by Taylor's theorem, sufficiently small steps along $\mathbf{p}_k$ are guaranteed to reduce $f$.
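Spelling out the Taylor argument: for a step size $\alpha > 0$, a first-order expansion gives

$$f(\mathbf{x}_k + \alpha \mathbf{p}_k) = f(\mathbf{x}_k) + \alpha \langle \mathbf{p}_k, \nabla f(\mathbf{x}_k) \rangle + o(\alpha),$$

so when $\langle \mathbf{p}_k, \nabla f(\mathbf{x}_k) \rangle < 0$ the linear term is negative and dominates the remainder for all sufficiently small $\alpha$, giving $f(\mathbf{x}_k + \alpha \mathbf{p}_k) < f(\mathbf{x}_k)$.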
Using this definition, the negative of a non-zero gradient is always a descent direction, as $\langle -\nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\langle \nabla f(\mathbf{x}_k), \nabla f(\mathbf{x}_k) \rangle = -\lVert \nabla f(\mathbf{x}_k) \rVert^2 < 0$.
Numerous methods exist for computing descent directions, each with different merits. For example, one could use gradient descent or the conjugate gradient method.
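As an illustration (not part of the original text), the following is a minimal Python sketch, assuming a simple hand-picked objective, of one iteration scheme that uses the negative gradient as the descent direction, checks the condition $\langle \mathbf{p}_k, \nabla f(\mathbf{x}_k) \rangle < 0$, and chooses the step size by basic backtracking. The objective `f`, its gradient `grad_f`, and the parameter values are assumptions made purely for the example.

```python
import numpy as np

def f(x):
    # Example objective (an assumption for illustration): convex quadratic plus a sine term.
    return 0.5 * x @ x + np.sin(x[0])

def grad_f(x):
    # Gradient of the example objective above.
    g = x.copy()
    g[0] += np.cos(x[0])
    return g

def descent_step(x, alpha0=1.0, rho=0.5, c=1e-4):
    """One line-search step along the negative-gradient descent direction."""
    g = grad_f(x)
    p = -g                 # negative gradient: always a descent direction when g != 0
    slope = p @ g          # inner product <p, grad f(x)>; must be negative
    assert slope < 0, "p is not a descent direction"
    alpha = alpha0
    # Backtracking: shrink alpha until a sufficient-decrease (Armijo) condition holds,
    # which must eventually succeed because p is a descent direction.
    while f(x + alpha * p) > f(x) + c * alpha * slope:
        alpha *= rho
    return x + alpha * p

x = np.array([2.0, -1.5])
for k in range(20):
    x = descent_step(x)
print(x, f(x))   # iterates approach a local minimizer of the example objective
```

Other methods differ only in how the direction $\mathbf{p}_k$ is constructed (e.g. conjugate directions instead of the raw negative gradient); the descent condition and the line-search step remain the same.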