Loading web-font TeX/Math/Italic

5.3 Example using conjugate directions

  5.3.1 First method, Direct calculus
  5.3.2 Second method, basic Conjugate direction
  5.3.3 Third method. Conjugate gradient
  5.3.4 Fourth method. Conjugate gradient using Fletcher-Reeves
  5.3.5 Fifth method. Conjugate gradient using Polak-Ribiere

This example is solved in number of ways. Given quadratic function J\left ( x_{1},x_{2}\right ) =\frac{1}{2}x^{T}Ax+bx^{T} where x=\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} . To find x^{\ast }, which minimizes J\left ( x\right ) . Let A=\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix} and b=\begin{bmatrix} -1\\ 1 \end{bmatrix}

5.3.1 First method, Direct calculus

\begin{align*} \nabla J\left ( x\right ) & =0\\ Ax+b & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \\\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \\\begin{bmatrix} 4x_{1}+2x_{2}-1\\ 2x_{1}+2x_{2}+1 \end{bmatrix} & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{align*}

Solving gives x^{\ast }=\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix}

5.3.2 Second method, basic Conjugate direction

Since A is of size n=2, then this will converge in 2 steps using conjugate directions. let x^{0}=\begin{bmatrix} 0\\ 0 \end{bmatrix} . Let first direction be v^{0}=\begin{bmatrix} 1\\ 0 \end{bmatrix}

Then h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ 0 \end{bmatrix} }=\frac{1}{4}
Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} +\frac{1}{4}\begin{bmatrix} 1\\ 0 \end{bmatrix} =\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} \end{align*}

Second step. We need to find v^{1}. Using conjugate mutual property of A, we solve for v^{1} using\begin{align*} \left ( v^{0}\right ) ^{T}Av^{1} & =0\\\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} v_{1}\\ v_{2}\end{bmatrix} & =0\\ 4v_{1}+2v_{2} & =0 \end{align*}

Let v_{1}=1 then v_{2}=-2 and hence v^{1}=\begin{bmatrix} 1\\ -2 \end{bmatrix}

Now we find the next optimal step h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 1 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 1 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -2 \end{bmatrix} }=\frac{3}{4}
Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} +\frac{3}{4}\begin{bmatrix} 1\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

Which is x^{\ast } that we found in first method. Using n=2 steps as expected. In implementation, we will have to check we converged by looking at \nabla J\left ( x^{2}\right ) which will be\begin{align*} \nabla J\left ( x^{2}\right ) & =Ax^{2}+b\\ & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{align*}

As expected.

5.3.3 Third method. Conjugate gradient

The difference here is that we find v^{i} on the fly after each step. Unlike the conjugate direction method, where v^{i} are all pre-computed. Let v^{0}=\nabla \left ( J\left ( x^{0}\right ) \right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix} . In this method, we always pick v^{0}=\nabla \left ( J\left ( x^{0}\right ) \right ) , where x^{0} is the starting guess vector. First step

h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1

Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}

Now we find the mutual conjugate v^{1} as follows\begin{align*} \beta _{0} & =\frac{\left ( \nabla J\left ( x^{1}\right ) \right ) ^{T}Av^{0}}{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{\left ( Ax^{1}+b\right ) \left ( Av^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) ^{T}\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }\\ & =\frac{\begin{bmatrix} 1\\ 1 \end{bmatrix} ^{T}\begin{bmatrix} -2\\ 0 \end{bmatrix} }{2}=\frac{-2}{2}=-1 \end{align*}

Hence\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) -\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ -2 \end{bmatrix} \end{align*}

Now that we found v^{1}, we repeat the process. h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 0 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 0 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}

Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

Which is the same as with conjugate direction method. Converged in 2 steps also.

5.3.4 Fourth method. Conjugate gradient using Fletcher-Reeves

In this method \beta _{k}=\frac{\nabla J\left ( u^{k+1}\right ) ^{T}\nabla J\left ( u^{k+1}\right ) }{\nabla J\left ( u^{k}\right ) ^{T}\nabla J\left ( u^{k}\right ) }=\frac{\left \Vert \nabla J\left ( u^{k+1}\right ) \right \Vert ^{2}}{\left \Vert \nabla J\left ( u^{k}\right ) \right \Vert ^{2}}

We also start here with v^{0}=\nabla J\left ( u^{0}\right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix} in this example. h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1
Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}

Now find the mutual conjugate v^{1} as follows, using Fletcher-Reeves formula \beta _{0}=\frac{\left \Vert \nabla J\left ( u^{1}\right ) \right \Vert ^{2}}{\left \Vert \nabla J\left ( u^{0}\right ) \right \Vert ^{2}}=\frac{\left \Vert Ax^{1}+b\right \Vert ^{2}}{\left \Vert Ax^{0}+b\right \Vert ^{2}}=\frac{\left \Vert \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) \right \Vert ^{2}}{\left \Vert \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) \right \Vert ^{2}}=\frac{\left \Vert \begin{bmatrix} 1\\ 1 \end{bmatrix} \right \Vert ^{2}}{\left \Vert \begin{bmatrix} -1\\ 1 \end{bmatrix} \right \Vert ^{2}}=\frac{\left ( \sqrt{2}\right ) ^{2}}{\left ( \sqrt{2}\right ) ^{2}}=1

\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) +\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} -2\\ 0 \end{bmatrix} \end{align*}

Now that we found v^{1}, we repeat the process. h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} -2 & 0 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -2 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}

Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

Which is the same as with conjugate direction method. It converges in 2 steps also.

5.3.5 Fifth method. Conjugate gradient using Polak-Ribiere

In this method \beta _{k}=\frac{\nabla J\left ( u^{k+1}\right ) ^{T}\left ( \nabla J\left ( u^{k+1}\right ) -\nabla J\left ( u^{k}\right ) \right ) }{\nabla J\left ( u^{k}\right ) ^{T}\nabla J\left ( u^{k}\right ) }

We also start here with v^{0}=\nabla J\left ( u^{0}\right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix} in this example. h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1
Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}

Now we find the mutual conjugate v^{1} direction as follows, using Polak-Ribiere formula \beta _{0}=\frac{\nabla J\left ( u^{1}\right ) ^{T}\left ( \nabla J\left ( u^{1}\right ) -\nabla J\left ( u^{0}\right ) \right ) }{\nabla J\left ( u^{0}\right ) ^{T}\nabla J\left ( u^{0}\right ) }

But

\begin{align*} \nabla J\left ( u^{1}\right ) & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ 1 \end{bmatrix} \\ \nabla J\left ( u^{0}\right ) & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} -1\\ 1 \end{bmatrix} \end{align*}

Hence

\beta _{0}=\frac{\begin{bmatrix} 1 & 1 \end{bmatrix} \left ( \begin{bmatrix} 1\\ 1 \end{bmatrix} -\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{2}{2}=1

Hence

\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) +\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} -2\\ 0 \end{bmatrix} \end{align*}

Now that we found v^{1}, we repeat the process. h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} -2 & 0 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -2 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}

Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

Which is the same as with conjugate direction method. Converges in 2 steps also as expected