Example using conjugate directions

Loading web-font TeX/Math/Italic

5.3 Example using conjugate directions

  5.3.1 First method, Direct calculus
  5.3.2 Second method, basic Conjugate direction
  5.3.3 Third method. Conjugate gradient
  5.3.4 Fourth method. Conjugate gradient using Fletcher-Reeves
  5.3.5 Fifth method. Conjugate gradient using Polak-Ribiere

This example is solved in number of ways. Given quadratic function $J\left ( x_{1},x_{2}\right ) =\frac{1}{2}x^{T}Ax+bx^{T}$ where $x=\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix}$ . To ﬁnd $x^{\ast }$ , which minimizes $J\left ( x\right )$ . Let $A=\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}$ and $b=\begin{bmatrix} -1\\ 1 \end{bmatrix}$

5.3.1 First method, Direct calculus

$\begin{align*} \nabla J\left ( x\right ) & =0\\ Ax+b & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \\\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \\\begin{bmatrix} 4x_{1}+2x_{2}-1\\ 2x_{1}+2x_{2}+1 \end{bmatrix} & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{align*}$

Solving gives

$x^{\ast }=\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix}$

5.3.2 Second method, basic Conjugate direction

Since $A$ is of size $n=2$ , then this will converge in $2$ steps using conjugate directions. let $x^{0}=\begin{bmatrix} 0\\ 0 \end{bmatrix}$ . Let ﬁrst direction be

$v^{0}=\begin{bmatrix} 1\\ 0 \end{bmatrix}$ Then

$h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ 0 \end{bmatrix} }=\frac{1}{4}$ Hence

$\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} +\frac{1}{4}\begin{bmatrix} 1\\ 0 \end{bmatrix} =\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} \end{align*}$

Second step. We need to ﬁnd $v^{1}$ . Using conjugate mutual property of $A$ , we solve for $v^{1}$ using

$\begin{align*} \left ( v^{0}\right ) ^{T}Av^{1} & =0\\\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} v_{1}\\ v_{2}\end{bmatrix} & =0\\ 4v_{1}+2v_{2} & =0 \end{align*}$

Let $v_{1}=1$ then $v_{2}=-2$ and hence

$v^{1}=\begin{bmatrix} 1\\ -2 \end{bmatrix}$ Now we ﬁnd the next optimal step

$h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 1 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 1 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -2 \end{bmatrix} }=\frac{3}{4}$ Hence

$\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} +\frac{3}{4}\begin{bmatrix} 1\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}$

Which is $x^{\ast }$ that we found in ﬁrst method. Using $n=2$ steps as expected. In implementation, we will have to check we converged by looking at $\nabla J\left ( x^{2}\right )$ which will be

$\begin{align*} \nabla J\left ( x^{2}\right ) & =Ax^{2}+b\\ & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{align*}$

As expected.

5.3.3 Third method. Conjugate gradient

The diﬀerence here is that we ﬁnd $v^{i}$ on the ﬂy after each step. Unlike the conjugate direction method, where $v^{i}$ are all pre-computed. Let $v^{0}=\nabla \left ( J\left ( x^{0}\right ) \right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix}$ . In this method, we always pick $v^{0}=\nabla \left ( J\left ( x^{0}\right ) \right )$ , where $x^{0}$ is the starting guess vector. First step

$h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1$ Hence

$\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}$

Now we ﬁnd the mutual conjugate $v^{1}$ as follows

$\begin{align*} \beta _{0} & =\frac{\left ( \nabla J\left ( x^{1}\right ) \right ) ^{T}Av^{0}}{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{\left ( Ax^{1}+b\right ) \left ( Av^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) ^{T}\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }\\ & =\frac{\begin{bmatrix} 1\\ 1 \end{bmatrix} ^{T}\begin{bmatrix} -2\\ 0 \end{bmatrix} }{2}=\frac{-2}{2}=-1 \end{align*}$

Hence

$\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) -\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ -2 \end{bmatrix} \end{align*}$

Now that we found $v^{1}$ , we repeat the process.

$h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 0 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 0 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}$ Hence

$\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}$

Which is the same as with conjugate direction method. Converged in $2$ steps also.

5.3.4 Fourth method. Conjugate gradient using Fletcher-Reeves

In this method

$\beta _{k}=\frac{\nabla J\left ( u^{k+1}\right ) ^{T}\nabla J\left ( u^{k+1}\right ) }{\nabla J\left ( u^{k}\right ) ^{T}\nabla J\left ( u^{k}\right ) }=\frac{\left \Vert \nabla J\left ( u^{k+1}\right ) \right \Vert ^{2}}{\left \Vert \nabla J\left ( u^{k}\right ) \right \Vert ^{2}}$ We also start here with

$v^{0}=\nabla J\left ( u^{0}\right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix}$ in this example.

$h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1$ Hence

$\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}$

Now ﬁnd the mutual conjugate $v^{1}$ as follows, using Fletcher-Reeves formula

$\beta _{0}=\frac{\left \Vert \nabla J\left ( u^{1}\right ) \right \Vert ^{2}}{\left \Vert \nabla J\left ( u^{0}\right ) \right \Vert ^{2}}=\frac{\left \Vert Ax^{1}+b\right \Vert ^{2}}{\left \Vert Ax^{0}+b\right \Vert ^{2}}=\frac{\left \Vert \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) \right \Vert ^{2}}{\left \Vert \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) \right \Vert ^{2}}=\frac{\left \Vert \begin{bmatrix} 1\\ 1 \end{bmatrix} \right \Vert ^{2}}{\left \Vert \begin{bmatrix} -1\\ 1 \end{bmatrix} \right \Vert ^{2}}=\frac{\left ( \sqrt{2}\right ) ^{2}}{\left ( \sqrt{2}\right ) ^{2}}=1$

$\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) +\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} -2\\ 0 \end{bmatrix} \end{align*}$

Now that we found $v^{1}$ , we repeat the process.

$h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} -2 & 0 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -2 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}$ Hence

Which is the same as with conjugate direction method. It converges in $2$ steps also.

5.3.5 Fifth method. Conjugate gradient using Polak-Ribiere

In this method

$\beta _{k}=\frac{\nabla J\left ( u^{k+1}\right ) ^{T}\left ( \nabla J\left ( u^{k+1}\right ) -\nabla J\left ( u^{k}\right ) \right ) }{\nabla J\left ( u^{k}\right ) ^{T}\nabla J\left ( u^{k}\right ) }$ We also start here with

$v^{0}=\nabla J\left ( u^{0}\right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix}$ in this example.

$h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1$ Hence

$\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}$

Now we ﬁnd the mutual conjugate $v^{1}$ direction as follows, using Polak-Ribiere formula

$\beta _{0}=\frac{\nabla J\left ( u^{1}\right ) ^{T}\left ( \nabla J\left ( u^{1}\right ) -\nabla J\left ( u^{0}\right ) \right ) }{\nabla J\left ( u^{0}\right ) ^{T}\nabla J\left ( u^{0}\right ) }$

But

$\begin{align*} \nabla J\left ( u^{1}\right ) & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ 1 \end{bmatrix} \\ \nabla J\left ( u^{0}\right ) & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} -1\\ 1 \end{bmatrix} \end{align*}$

Hence

$\beta _{0}=\frac{\begin{bmatrix} 1 & 1 \end{bmatrix} \left ( \begin{bmatrix} 1\\ 1 \end{bmatrix} -\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{2}{2}=1$ Hence

$\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) +\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} -2\\ 0 \end{bmatrix} \end{align*}$

Now that we found $v^{1}$ , we repeat the process.

$h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} -2 & 0 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -2 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}$ Hence

Which is the same as with conjugate direction method. Converges in $2$ steps also as expected

[next] [prev] [prev-tail] [front] [up]