The inputs are the initial starting point \(x(0)\) and the objective function \(J(x)\) itself.
1. init: \(x^{0}=x(0)\), \(k=0\)
2. \(g^{k} = \nabla J(x^{k})\)
3. \(\alpha _{k} = \arg \min _{\alpha }J\left ( x^{k} - \alpha \frac{g^{k}}{\|g^{k}\|} \right ) \) (line search)
4. \(x^{k+1} = x^{k} - \alpha _{k} \frac{g^{k}}{\|g^{k}\|} \)
5. \(k = k +1\)
6. goto 2
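The listing above translates almost line for line into code. The following is a minimal Python sketch of these steps; the function names, the stopping test on \(\|g^{k}\|\), and the use of scipy.optimize.minimize_scalar for the line search are choices made here, not part of the algorithm statement.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(J, grad_J, x0, tol=1e-6, max_iter=500):
    """Steepest descent with a scalar line search along the
    normalized negative gradient (steps 1-6 above)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_J(x)                      # step 2
        if np.linalg.norm(g) < tol:        # stopping test (the listing loops forever)
            break
        d = g / np.linalg.norm(g)          # normalized gradient direction
        alpha = minimize_scalar(lambda a: J(x - a * d)).x   # step 3: line search
        x = x - alpha * d                  # step 4
    return x

# Example: minimum of (x1 - 1)^2 + 10 (x2 + 2)^2 is at (1, -2)
J = lambda x: (x[0] - 1)**2 + 10 * (x[1] + 2)**2
grad_J = lambda x: np.array([2 * (x[0] - 1), 20 * (x[1] + 2)])
print(steepest_descent(J, grad_J, np.zeros(2)))
```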
|
|
|
|
If the objective function \(J(x)\) is quadratic, \(J(x)=\frac{1}{2}x^{T} A x - b^{T} x +c\) with \(A\) symmetric (so that \(\nabla J(x) = A x - b\)), then there is no need to do a line search, since the optimal step size has a closed form.
The inputs are the initial starting point \(x(0)\) and \(A,b\). The algorithm becomes
1. init: \(x^{0}=x(0)\), \(k=0\)
2. \(g^{k} = \nabla J(x^{k}) = A x^{k} - b\)
3. \(\alpha _{k} = \frac{ \left [ g^{k} \right ] ^{T} g^{k} }{ \left [ g^{k} \right ] ^{T} A\, g^{k}} \)
4. \(x^{k+1} = x^{k} - \alpha _{k} g^{k} \)
5. \(k = k +1\)
6. goto 2
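A sketch of the quadratic case in the same style, assuming \(A\) is symmetric positive definite; the example matrix and the check against numpy.linalg.solve are illustrative only.

```python
import numpy as np

def steepest_descent_quadratic(A, b, x0, tol=1e-10, max_iter=500):
    """Steepest descent for J(x) = 1/2 x^T A x - b^T x + c using the
    closed-form step size of step 3 (A symmetric positive definite)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = A @ x - b                      # step 2: gradient
        if np.linalg.norm(g) < tol:
            break
        alpha = (g @ g) / (g @ (A @ g))    # step 3: exact minimizer along -g
        x = x - alpha * g                  # step 4
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(steepest_descent_quadratic(A, b, np.zeros(2)))  # minimizing J solves A x = b
print(np.linalg.solve(A, b))                          # check
```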
|
|
|
|
For quadratic \(J(x)=\frac{1}{2}x^{T} A x - b^{T} x +c\), the conjugate direction algorithm is as follows.
Input: the starting point \(x(0)\), the pair \(A,b\), and a set of \(n\) mutually conjugate vectors \(\{v^{0},v^{1},\dots ,v^{n-1}\}\) with respect to \(A\), where \(n\) is the size of \(A\). In other words, \((v^{i})^{T} A v^{j}=0\) for \(i \neq j\).
These \(v^{i}\) vectors have to be generated before starting the algorithm. With the conjugate gradient method (below), these A-conjugate vectors are generated on the fly inside the algorithm as it iterates. This is the main difference between conjugate direction and conjugate gradient.
1. init: \(x^{0}=x(0)\), \(k=0\)
2. \(g^{k} = \nabla J(x^{k}) = A x^{k} - b\)
3. \(\alpha _{k} = \frac{ \left [ g^{k}\right ] ^{T} v^{k}}{ \left [ v^{k} \right ] ^{T} A\, v^{k}} \)
4. \(x^{k+1} = x^{k} - \alpha _{k} v^{k} \)
5. \(k = k +1\)
6. goto 2
|
|
|
|
The difference between the above and the steepest descent before it is in lines 3 and 4, where the conjugate direction \(v^{k}\) now replaces the gradient \(g^{k}\).
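Below is a sketch of the whole procedure in Python. The required \(v^{i}\) vectors are produced here by Gram-Schmidt A-orthogonalization of the standard basis, which is one possible construction; the notes do not prescribe a particular way of generating them.

```python
import numpy as np

def a_conjugate_basis(A):
    """Gram-Schmidt A-orthogonalization of the standard basis: one way
    to produce the required set {v^0, ..., v^{n-1}}."""
    n = A.shape[0]
    vs = []
    for i in range(n):
        v = np.eye(n)[i].copy()
        for w in vs:
            v -= (v @ (A @ w)) / (w @ (A @ w)) * w   # remove A-projection onto w
        vs.append(v)
    return vs

def conjugate_direction(A, b, x0):
    """Conjugate direction method: for an n-by-n SPD matrix A it
    reaches the minimizer in exactly n steps (exact arithmetic)."""
    x = np.asarray(x0, dtype=float)
    for v in a_conjugate_basis(A):
        g = A @ x - b                       # step 2
        alpha = (g @ v) / (v @ (A @ v))     # step 3
        x = x - alpha * v                   # step 4
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_direction(A, b, np.zeros(2)))
print(np.linalg.solve(A, b))                # check
```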
The conjugate direction method required finding the set of \(v\) vectors before starting the algorithm. The conjugate gradient algorithm below generates these vectors as it runs.
Input: the starting point \(x(0)\), and \(A,b\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0}) = A x^{0} - b\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k} = -\frac{ \left [ g^{k}\right ] ^{T} v^{k}}{ \left [ v^{k} \right ] ^{T} A\, v^{k}} \)
3. \(x^{k+1} = x^{k} + \alpha _{k} v^{k} \)
4. \(g^{k+1} = \nabla J(x^{k+1}) = A x^{k+1} - b\)
5. \(\beta = \frac{ \left [ g^{k+1}\right ] ^{T} A v^{k}}{ \left [ v^{k} \right ] ^{T} A v^{k} }\)
6. \(v^{k+1} = - g^{k+1} + \beta v^{k}\)
7. \(k = k +1\)
8. goto 2
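A sketch of this conjugate gradient algorithm for the quadratic case, again assuming \(A\) symmetric positive definite; in exact arithmetic it terminates in at most \(n\) iterations, which the loop bound below relies on.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10):
    """Conjugate gradient for J(x) = 1/2 x^T A x - b^T x + c,
    generating the A-conjugate directions on the fly (steps 1-8)."""
    x = np.asarray(x0, dtype=float)
    g = A @ x - b                            # step 1: g^0
    v = -g                                   # step 1: v^0
    for _ in range(len(b)):                  # at most n iterations (exact arithmetic)
        if np.linalg.norm(g) < tol:
            break
        Av = A @ v
        alpha = -(g @ v) / (v @ Av)          # step 2
        x = x + alpha * v                    # step 3
        g = A @ x - b                        # step 4
        beta = (g @ Av) / (v @ Av)           # step 5
        v = -g + beta * v                    # step 6
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)))  # matches np.linalg.solve(A, b)
```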
|
|
|
|
If the objective function is not quadratic, then we cannot use \(A,b\) to generate \(\beta \). Using the Hestenes-Stiefel formula for \(\beta \), the above algorithm becomes the following.
Input: the starting point \(x(0)\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0})\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k} = \arg \min _{\alpha }J\left ( x^{k} + \alpha v^{k} \right ) \) (line search)
3. \(x^{k+1} = x^{k} + \alpha _{k} v^{k} \)
4. \(g^{k+1} = \nabla J(x^{k+1})\)
5. \(\beta = \frac{\left [ g^{k+1}\right ] ^{T} \left [ g^{k+1}-g^{k}\right ] }{\left [ v^{k} \right ] ^{T} \left [ g^{k+1}-g^{k}\right ] }\)
6. \(v^{k+1} = - g^{k+1} + \beta v^{k}\)
7. \(k = k +1\)
8. goto 2
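A sketch of the Hestenes-Stiefel variant. The use of scipy.optimize.minimize_scalar as a stand-in for the line search in step 2, and the mild non-quadratic test function at the end, are choices made here for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def cg_hestenes_stiefel(J, grad_J, x0, tol=1e-6, max_iter=200):
    """Nonlinear conjugate gradient with the Hestenes-Stiefel beta
    and a numerical line search (steps 1-8 above)."""
    x = np.asarray(x0, dtype=float)
    g = grad_J(x)                          # step 1
    v = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: J(x + a * v)).x   # step 2
        x = x + alpha * v                                   # step 3
        g_new = grad_J(x)                                   # step 4
        y = g_new - g
        beta = (g_new @ y) / (v @ y)                        # step 5 (Hestenes-Stiefel)
        v = -g_new + beta * v                               # step 6
        g = g_new                          # carry gradient into iteration k+1
    return x

# Example on a non-quadratic function with minimum at (1, -2)
J = lambda x: (x[0] - 1)**4 + (x[1] + 2)**2
grad_J = lambda x: np.array([4 * (x[0] - 1)**3, 2 * (x[1] + 2)])
print(cg_hestenes_stiefel(J, grad_J, np.zeros(2)))
```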
|
|
|
|
Similarly, using the Polak-Ribière formula for \(\beta \), the conjugate gradient algorithm becomes the following.
Input: the starting point \(x(0)\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0})\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k} = \arg \min _{\alpha }J\left ( x^{k} + \alpha v^{k} \right ) \) (line search)
3. \(x^{k+1} = x^{k} + \alpha _{k} v^{k} \)
4. \(g^{k+1} = \nabla J(x^{k+1})\)
5. \(\beta = \frac{\left [ g^{k+1}\right ] ^{T} \left [ g^{k+1}-g^{k}\right ] }{ \left [ g^{k}\right ] ^{T} g^{k} }\)
6. \(v^{k+1} = - g^{k+1} + \beta v^{k}\)
7. \(k = k +1\)
8. goto 2
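Only step 5 changes relative to the Hestenes-Stiefel sketch above. As an illustration, the Polak-Ribière \(\beta \) as a drop-in replacement (the unused third argument keeps one uniform signature across the three formulas, an implementation choice made here):

```python
def beta_polak_ribiere(g_new, g_old, v_old):
    """Polak-Ribiere beta for step 5; v_old is unused but kept so all
    three beta formulas share one signature. The rest of the loop is
    identical to the Hestenes-Stiefel sketch above."""
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)
```

A common practical safeguard, often called PR+, replaces \(\beta \) by \(\max (\beta ,0)\), which amounts to restarting with the steepest descent direction whenever \(\beta \) turns negative.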
|
|
|
|
Finally, using the Fletcher-Reeves formula for \(\beta \), the conjugate gradient algorithm becomes the following.
Input: the starting point \(x(0)\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0})\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k}=\arg \min _{\alpha }J\left ( x^{k}+\alpha v^{k}\right ) \) (line search)
3. \(x^{k+1}=x^{k}+\alpha _{k}v^{k}\)
4. \(g^{k+1}=\nabla J(x^{k+1})\)
5. \(\beta =\frac{\left [ g^{k+1}\right ] ^{T}g^{k+1}}{\left [ g^{k}\right ] ^{T}g^{k}}\)
6. \(v^{k+1}=-g^{k+1}+\beta v^{k}\)
7. \(k=k+1\)
8. goto 2
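To close, here is the Fletcher-Reeves \(\beta \) in the same style, together with a generic driver that takes the \(\beta \) formula as a parameter, to emphasize that the three variants differ only in step 5. The parameterization and the test function are choices made here, not from the notes.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def beta_fletcher_reeves(g_new, g_old, v_old):
    """Fletcher-Reeves beta for step 5 (v_old unused, uniform signature)."""
    return (g_new @ g_new) / (g_old @ g_old)

def nonlinear_cg(J, grad_J, x0, beta_rule, tol=1e-6, max_iter=200):
    """The same loop as the Hestenes-Stiefel sketch, with the beta
    formula passed in: the three variants differ only in step 5."""
    x = np.asarray(x0, dtype=float)
    g = grad_J(x)
    v = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: J(x + a * v)).x   # line search
        x = x + alpha * v
        g_new = grad_J(x)
        v = -g_new + beta_rule(g_new, g, v) * v             # step 5 plugged in
        g = g_new
    return x

J = lambda x: (x[0] - 1)**4 + (x[1] + 2)**2
grad_J = lambda x: np.array([4 * (x[0] - 1)**3, 2 * (x[1] + 2)])
print(nonlinear_cg(J, grad_J, np.zeros(2), beta_fletcher_reeves))
```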
|
|
|
|