The inputs are the initial starting point \(x(0)\) and the objective function \(J(x)\) itself.
1. init: \(x^{0}=x(0)\), \(k=0\)
2. \(g^{k} = \nabla J(x^{k})\)
3. \(\alpha _{k} = \arg \min _{\alpha }J\left ( x^{k} - \alpha \frac{g^{k}}{\|g^{k}\|} \right ) \) (line search)
4. \(x^{k+1} = x^{k} - \alpha _{k} \frac{g^{k}}{\|g^{k}\|} \)
5. \(k = k +1\)
6. goto 2
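The listing above translates almost line for line into code. The following is a minimal Python sketch of these steps; the function names, the stopping test on \(\|g^{k}\|\), and the use of scipy.optimize.minimize_scalar for the line search are choices made here, not part of the algorithm statement.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(J, grad_J, x0, tol=1e-6, max_iter=500):
    """Steepest descent with a scalar line search along the
    normalized negative gradient (steps 1-6 above)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_J(x)                      # step 2
        if np.linalg.norm(g) < tol:        # stopping test (the listing loops forever)
            break
        d = g / np.linalg.norm(g)          # normalized gradient direction
        alpha = minimize_scalar(lambda a: J(x - a * d)).x   # step 3: line search
        x = x - alpha * d                  # step 4
    return x

# Example: minimum of (x1 - 1)^2 + 10 (x2 + 2)^2 is at (1, -2)
J = lambda x: (x[0] - 1)**2 + 10 * (x[1] + 2)**2
grad_J = lambda x: np.array([2 * (x[0] - 1), 20 * (x[1] + 2)])
print(steepest_descent(J, grad_J, np.zeros(2)))
```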
|
|
|
|
If the objective function \(J(x)\) is quadratic, \(J(x)=\frac{1}{2}x^{T} A x - b^{T} x +c\) with \(A\) symmetric (so that \(\nabla J(x) = A x - b\)), then there is no need to do a line search, since the optimal step size has a closed form.
The inputs are the initial starting point \(x(0)\) and \(A,b\). The algorithm becomes
1. init: \(x^{0}=x(0)\), \(k=0\)
2. \(g^{k} = \nabla J(x^{k}) = A x^{k} - b\)
3. \(\alpha _{k} = \frac{ \left [ g^{k} \right ] ^{T} g^{k} }{ \left [ g^{k} \right ] ^{T} A\, g^{k}} \)
4. \(x^{k+1} = x^{k} - \alpha _{k} g^{k} \)
5. \(k = k +1\)
6. goto 2
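A sketch of the quadratic case in the same style, assuming \(A\) is symmetric positive definite; the example matrix and the check against numpy.linalg.solve are illustrative only.

```python
import numpy as np

def steepest_descent_quadratic(A, b, x0, tol=1e-10, max_iter=500):
    """Steepest descent for J(x) = 1/2 x^T A x - b^T x + c using the
    closed-form step size of step 3 (A symmetric positive definite)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = A @ x - b                      # step 2: gradient
        if np.linalg.norm(g) < tol:
            break
        alpha = (g @ g) / (g @ (A @ g))    # step 3: exact minimizer along -g
        x = x - alpha * g                  # step 4
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(steepest_descent_quadratic(A, b, np.zeros(2)))  # minimizing J solves A x = b
print(np.linalg.solve(A, b))                          # check
```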
|
|
|
|
For quadratic \(J(x)=\frac{1}{2}x^{T} A x - b^{T} x +c\), the conjugate direction algorithm is as follows.
Input: the starting point \(x(0)\), the pair \(A,b\), and a set of \(n\) mutually conjugate vectors \(\{v^{0},v^{1},\dots ,v^{n-1}\}\) with respect to \(A\), where \(n\) is the size of \(A\). In other words, \((v^{i})^{T} A v^{j}=0\) for \(i \neq j\).
These \(v^{i}\) vectors have to be generated before starting the algorithm. With the conjugate gradient method (below), these A-conjugate vectors are generated on the fly inside the algorithm as it iterates. This is the main difference between conjugate direction and conjugate gradient.
1. init: \(x^{0}=x(0)\), \(k=0\)
2. \(g^{k} = \nabla J(x^{k}) = A x^{k} - b\)
3. \(\alpha _{k} = \frac{ \left [ g^{k}\right ] ^{T} v^{k}}{ \left [ v^{k} \right ] ^{T} A\, v^{k}} \)
4. \(x^{k+1} = x^{k} - \alpha _{k} v^{k} \)
5. \(k = k +1\)
6. goto 2
|
|
|
|
The difference between the above and the steepest descent before it is in lines 3 and 4, where the conjugate direction \(v^{k}\) now replaces the gradient \(g^{k}\).
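Below is a sketch of the whole procedure in Python. The required \(v^{i}\) vectors are produced here by Gram-Schmidt A-orthogonalization of the standard basis, which is one possible construction; the notes do not prescribe a particular way of generating them.

```python
import numpy as np

def a_conjugate_basis(A):
    """Gram-Schmidt A-orthogonalization of the standard basis: one way
    to produce the required set {v^0, ..., v^{n-1}}."""
    n = A.shape[0]
    vs = []
    for i in range(n):
        v = np.eye(n)[i].copy()
        for w in vs:
            v -= (v @ (A @ w)) / (w @ (A @ w)) * w   # remove A-projection onto w
        vs.append(v)
    return vs

def conjugate_direction(A, b, x0):
    """Conjugate direction method: for an n-by-n SPD matrix A it
    reaches the minimizer in exactly n steps (exact arithmetic)."""
    x = np.asarray(x0, dtype=float)
    for v in a_conjugate_basis(A):
        g = A @ x - b                       # step 2
        alpha = (g @ v) / (v @ (A @ v))     # step 3
        x = x - alpha * v                   # step 4
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_direction(A, b, np.zeros(2)))
print(np.linalg.solve(A, b))                # check
```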
The conjugate direction method required finding the set of \(v\) vectors before starting the algorithm. The conjugate gradient algorithm below generates these vectors as it runs.
Input: the starting point \(x(0)\), and \(A,b\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0}) = A x^{0} - b\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k} = -\frac{ \left [ g^{k}\right ] ^{T} v^{k}}{ \left [ v^{k} \right ] ^{T} A\, v^{k}} \)
3. \(x^{k+1} = x^{k} + \alpha _{k} v^{k} \)
4. \(g^{k+1} = \nabla J(x^{k+1}) = A x^{k+1} - b\)
5. \(\beta = \frac{ \left [ g^{k+1}\right ] ^{T} A v^{k}}{ \left [ v^{k} \right ] ^{T} A v^{k} }\)
6. \(v^{k+1} = - g^{k+1} + \beta v^{k}\)
7. \(k = k +1\)
8. goto 2
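A sketch of this conjugate gradient algorithm for the quadratic case, again assuming \(A\) symmetric positive definite; in exact arithmetic it terminates in at most \(n\) iterations, which the loop bound below relies on.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10):
    """Conjugate gradient for J(x) = 1/2 x^T A x - b^T x + c,
    generating the A-conjugate directions on the fly (steps 1-8)."""
    x = np.asarray(x0, dtype=float)
    g = A @ x - b                            # step 1: g^0
    v = -g                                   # step 1: v^0
    for _ in range(len(b)):                  # at most n iterations (exact arithmetic)
        if np.linalg.norm(g) < tol:
            break
        Av = A @ v
        alpha = -(g @ v) / (v @ Av)          # step 2
        x = x + alpha * v                    # step 3
        g = A @ x - b                        # step 4
        beta = (g @ Av) / (v @ Av)           # step 5
        v = -g + beta * v                    # step 6
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)))  # matches np.linalg.solve(A, b)
```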
|
|
|
|
If the objective function is not quadratic, then we cannot use \(A,b\) to generate \(\beta \). Using the Hestenes-Stiefel formula for \(\beta \), the above algorithm becomes the following.
Input: the starting point \(x(0)\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0})\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k} = \arg \min _{\alpha }J\left ( x^{k} + \alpha v^{k} \right ) \) (line search)
3. \(x^{k+1} = x^{k} + \alpha _{k} v^{k} \)
4. \(g^{k+1} = \nabla J(x^{k+1})\)
5. \(\beta = \frac{\left [ g^{k+1}\right ] ^{T} \left [ g^{k+1}-g^{k}\right ] }{\left [ v^{k} \right ] ^{T} \left [ g^{k+1}-g^{k}\right ] }\)
6. \(v^{k+1} = - g^{k+1} + \beta v^{k}\)
7. \(k = k +1\)
8. goto 2
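A sketch of the Hestenes-Stiefel variant. The use of scipy.optimize.minimize_scalar as a stand-in for the line search in step 2, and the mild non-quadratic test function at the end, are choices made here for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def cg_hestenes_stiefel(J, grad_J, x0, tol=1e-6, max_iter=200):
    """Nonlinear conjugate gradient with the Hestenes-Stiefel beta
    and a numerical line search (steps 1-8 above)."""
    x = np.asarray(x0, dtype=float)
    g = grad_J(x)                          # step 1
    v = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: J(x + a * v)).x   # step 2
        x = x + alpha * v                                   # step 3
        g_new = grad_J(x)                                   # step 4
        y = g_new - g
        beta = (g_new @ y) / (v @ y)                        # step 5 (Hestenes-Stiefel)
        v = -g_new + beta * v                               # step 6
        g = g_new                          # carry gradient into iteration k+1
    return x

# Example on a non-quadratic function with minimum at (1, -2)
J = lambda x: (x[0] - 1)**4 + (x[1] + 2)**2
grad_J = lambda x: np.array([4 * (x[0] - 1)**3, 2 * (x[1] + 2)])
print(cg_hestenes_stiefel(J, grad_J, np.zeros(2)))
```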
|
|
|
|
Similarly, using the Polak-Ribière formula for \(\beta \), the conjugate gradient algorithm becomes the following.
Input: the starting point \(x(0)\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0})\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k} = \arg \min _{\alpha }J\left ( x^{k} + \alpha v^{k} \right ) \) (line search)
3. \(x^{k+1} = x^{k} + \alpha _{k} v^{k} \)
4. \(g^{k+1} = \nabla J(x^{k+1})\)
5. \(\beta = \frac{\left [ g^{k+1}\right ] ^{T} \left [ g^{k+1}-g^{k}\right ] }{ \left [ g^{k}\right ] ^{T} g^{k} }\)
6. \(v^{k+1} = - g^{k+1} + \beta v^{k}\)
7. \(k = k +1\)
8. goto 2
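Only step 5 changes relative to the Hestenes-Stiefel sketch above. As an illustration, the Polak-Ribière \(\beta \) as a drop-in replacement (the unused third argument keeps one uniform signature across the three formulas, an implementation choice made here):

```python
def beta_polak_ribiere(g_new, g_old, v_old):
    """Polak-Ribiere beta for step 5; v_old is unused but kept so all
    three beta formulas share one signature. The rest of the loop is
    identical to the Hestenes-Stiefel sketch above."""
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)
```

A common practical safeguard, often called PR+, replaces \(\beta \) by \(\max (\beta ,0)\), which amounts to restarting with the steepest descent direction whenever \(\beta \) turns negative.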
|
|
|
|
Finally, using the Fletcher-Reeves formula for \(\beta \), the conjugate gradient algorithm becomes the following.
Input: the starting point \(x(0)\).
1. init: \(x^{0}=x(0)\), \(k=0\), \(g^{0}=\nabla J(x^{0})\), \(v^{0}=-g^{0}\)
2. \(\alpha _{k}=\arg \min _{\alpha }J\left ( x^{k}+\alpha v^{k}\right ) \) (line search)
3. \(x^{k+1}=x^{k}+\alpha _{k}v^{k}\)
4. \(g^{k+1}=\nabla J(x^{k+1})\)
5. \(\beta =\frac{\left [ g^{k+1}\right ] ^{T}g^{k+1}}{\left [ g^{k}\right ] ^{T}g^{k}}\)
6. \(v^{k+1}=-g^{k+1}+\beta v^{k}\)
7. \(k=k+1\)
8. goto 2
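To close, here is the Fletcher-Reeves \(\beta \) in the same style, together with a generic driver that takes the \(\beta \) formula as a parameter, to emphasize that the three variants differ only in step 5. The parameterization and the test function are choices made here, not from the notes.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def beta_fletcher_reeves(g_new, g_old, v_old):
    """Fletcher-Reeves beta for step 5 (v_old unused, uniform signature)."""
    return (g_new @ g_new) / (g_old @ g_old)

def nonlinear_cg(J, grad_J, x0, beta_rule, tol=1e-6, max_iter=200):
    """The same loop as the Hestenes-Stiefel sketch, with the beta
    formula passed in: the three variants differ only in step 5."""
    x = np.asarray(x0, dtype=float)
    g = grad_J(x)
    v = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        alpha = minimize_scalar(lambda a: J(x + a * v)).x   # line search
        x = x + alpha * v
        g_new = grad_J(x)
        v = -g_new + beta_rule(g_new, g, v) * v             # step 5 plugged in
        g = g_new
    return x

J = lambda x: (x[0] - 1)**4 + (x[1] + 2)**2
grad_J = lambda x: np.array([4 * (x[0] - 1)**3, 2 * (x[1] + 2)])
print(nonlinear_cg(J, grad_J, np.zeros(2), beta_fletcher_reeves))
```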
|
|
|
|