A similarity transformation is

$$B = M^{-1} A M$$

where $A$, $B$, $M$ are square matrices. The goal of the similarity transformation is to find a matrix $B$ which has a simpler form than $A$, so that we can use $B$ in place of $A$ to ease some computational work. Let us set our goal in having $B$ be a diagonal matrix (a general diagonal form is called block diagonal or Jordan form, but here we are just looking at the case of $B$ being a diagonal matrix).

The question becomes: given $A$, can we find $M$ such that $M^{-1} A M$ is diagonal?
The standard method to show the above is via an algebraic method, which we show first. Next we show a geometric method that explains the similarity transformation geometrically.
1.1 Derivation of similarity transformation based on algebraic method
Starting with

$$B = M^{-1} A M$$

our goal is to find a real matrix $B$ such that it is diagonal. From the above, by pre-multiplying each side by $M$ we obtain

$$M B = A M$$

Now, since our goal is to make $B$ diagonal, let us select the eigenvalues of $A$ to be the diagonal of $B$. Writing the columns of $M$ as $m_1, m_2, \dots, m_n$, we write the above in expanded form as follows

$$A \begin{bmatrix} m_1 & m_2 & \cdots & m_n \end{bmatrix} = \begin{bmatrix} m_1 & m_2 & \cdots & m_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}$$

The above can be written as separate equations by looking at the column view of matrix multiplication:

$$A m_1 = \lambda_1 m_1$$

and

$$A m_2 = \lambda_2 m_2$$

all the way to the $n$-th column of $M$:

$$A m_n = \lambda_n m_n$$

Each of the above equations is in the form $A x = \lambda x$, and this is the key idea.
The above shows that if we take the diagonal elements of $B$ to be the eigenvalues of $A$, then the matrix $M$ has the (right) eigenvectors of $A$ as its columns.

Hence, if we want the matrix $B$ to be diagonal and real, this means the matrix $A$ itself must have some conditions put on it. Specifically, the eigenvalues of $A$ must be real, and $A$ must have $n$ linearly independent eigenvectors (which is guaranteed when the eigenvalues are all distinct). Both conditions hold when $A$ is real and symmetric. Therefore, we have the general result that given a matrix $A$ which has distinct and real eigenvalues, we can find a matrix similar to it which is real and diagonal, and these diagonal values are the eigenvalues of $A$.
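As a quick numerical check of this result, here is a minimal numpy sketch (the symmetric matrix $A$ below is an arbitrary example choice, not one from the text): we build $M$ from the eigenvectors of $A$ and verify that $M^{-1} A M$ comes out real and diagonal.

```python
import numpy as np

# An arbitrary real symmetric matrix (example choice, not from the text)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# The columns of M are the (right) eigenvectors of A
eigenvalues, M = np.linalg.eig(A)

# B = M^{-1} A M should be real and diagonal, with the eigenvalues of A
# appearing on the diagonal
B = np.linalg.inv(M) @ A @ M
print(np.round(B, 10))
print(np.round(eigenvalues, 10))
```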
1.2 Derivation of similarity transformation based on geometric method
It turns out that this derivation is more complicated than the algebraic approach. It involves a change of basis matrix (which will be our $M$ matrix) and the representation of a linear transformation as a matrix under some basis (which will be our $A$ matrix). Let us start from the beginning.
Given a vector $x \in \mathbb{R}^n$, it can have many different representations (or coordinates) depending on which basis is used. There are an infinite number of bases that span $\mathbb{R}^n$, hence there are infinitely many representations for the same vector. Consider for example $\mathbb{R}^2$. The standard basis vectors are $\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}$, but another basis is $\left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \end{bmatrix} \right\}$, and yet another is $\left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}$, and so on.
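To make the idea of coordinates concrete, here is a small numpy check (the vector and the second basis are the illustrative choices mentioned above) that the same vector has different coordinates under different bases:

```python
import numpy as np

# The vector itself, written in standard coordinates
x = np.array([3.0, 1.0])

# W.r.t. the standard basis {(1,0), (0,1)} the coordinates are just (3, 1).
# W.r.t. the basis {(1,1), (1,-1)} the coordinates (c1, c2) solve
#   c1*(1,1) + c2*(1,-1) = x
B = np.column_stack(([1.0, 1.0], [1.0, -1.0]))  # basis vectors as columns
c = np.linalg.solve(B, x)
print(c)  # [2. 1.], since 2*(1,1) + 1*(1,-1) = (3,1)
```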
In $\mathbb{R}^n$, any $n$ linearly independent vectors can be used as a basis. Let $x_v$ be the vector representation in $\mathbb{R}^n$ w.r.t. basis $v = \{v_1, \dots, v_n\}$, and let $x_{v'}$ be the vector representation w.r.t. basis $v' = \{v'_1, \dots, v'_n\}$.

An important problem in linear algebra is to obtain $x_v$ given $x_{v'}$ and the change of basis matrix $M$.

This requires finding a matrix representation for the change of basis. Once the matrix $M$ is found, we can write

$$x_v = M x_{v'} \tag{1}$$

where $M$ is the change of basis matrix which, when applied to $x_{v'}$, returns the coordinates of the vector w.r.t. basis $v$.
From (1) we see that given $x_v$, then to obtain $x_{v'}$ we write $x_{v'} = M^{-1} x_v$.

Another important problem is that of a linear transformation $T$, where now we apply some transformation on the whole space and we want to find what happens to the space coordinates (all the position vectors) due to this transformation.

Consider for example $T$ which is a rotation in $\mathbb{R}^2$ by some angle $\theta$. Here, every position vector in the space is affected by this rotation, but the basis remains fixed, and we want to find the new coordinates of each point in the space.
Let $x$ be the position vector before applying the linear transformation, and let $y$ be the new position vector. Notice that both vectors are written with respect to the same basis $v$. Similar to the change of basis, we want a way to find $y$ from $x$, hence we need a way to represent this linear transformation by a matrix, say $A$, with respect to the basis $v$, and so we could write the following

$$y_v = A x_v \tag{2}$$

Assume the basis is changed to $v'$. Let the representation of the same linear transformation $T$ w.r.t. basis $v'$ be called $A'$, so we write

$$y_{v'} = A' x_{v'} \tag{2A}$$

Hence when the basis changes from $v$ to $v'$, the representation of $T$ changes from $A$ to $A'$.
Now assume we are given some linear transformation $T$ and its representation $A$ w.r.t. basis $v$; how could we find its representation $A'$ w.r.t. a new basis $v'$?

From (2) we have $y_v = A x_v$. But from (1), $x_v = M x_{v'}$, hence the above becomes $y_v = A M x_{v'}$. Pre-multiplying this equation from the left by $M^{-1}$ we obtain $M^{-1} y_v = M^{-1} A M x_{v'}$. But since $M^{-1} y_v = y_{v'}$, the above reduces to $y_{v'} = M^{-1} A M x_{v'}$. But from (2A), $y_{v'} = A' x_{v'}$, hence the above becomes $A' x_{v'} = M^{-1} A M x_{v'}$. Therefore

$$A' = M^{-1} A M \tag{3}$$

Notice the difference between change of basis and linear transformation. In a change of basis, the vector remains fixed in space, but the basis changes, and we want to find the coordinates of the vector w.r.t. the new basis. With a linear transformation, the vector itself is changed from $x$ to $y$, but both vectors are expressed w.r.t. the same basis.
Equation (3) allows us to obtain a new matrix representation of the linear transformation $T$ by expressing the same $T$ w.r.t. a different basis. Hence if we are given a representation $A$ of $T$ which is not the most optimal representation, we can, by a change of basis, obtain a different representation $A'$ of the same $T$ by using (3). The most optimal matrix representation for a linear transformation is a diagonal matrix.
Equation (3) is called a similarity transformation. To make it easier to compare (3) above with what we wrote in the previous section, where we derived the above using an algebraic approach, we let $B = A'$, hence (3) becomes

$$B = M^{-1} A M$$

The matrix $B$ is similar to $A$ (i.e. $A'$ is similar to $A$). Both matrices represent the same linear transformation $T$ applied on the space. We will show below some examples of how to find $B$ given $A$ and $M$. But first we show how to obtain the matrix representation of $T$.
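Before moving on, here is a small numerical sanity check of equation (3) (a numpy sketch; the random $A$ and $M$ are arbitrary stand-ins for a representation and a change of basis matrix): applying $T$ in basis $v$ and converting the result to $v'$ agrees with applying $A' = M^{-1} A M$ directly in $v'$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))     # representation of T w.r.t. basis v
M = rng.standard_normal((3, 3))     # change of basis matrix (invertible here)

A_prime = np.linalg.inv(M) @ A @ M  # representation of T w.r.t. basis v'

x_vp = rng.standard_normal(3)       # coordinates of a vector w.r.t. v'
x_v = M @ x_vp                      # the same vector w.r.t. v

y_v = A @ x_v                       # transformed vector, coordinates w.r.t. v
y_vp = A_prime @ x_vp               # transformed vector, coordinates w.r.t. v'

# Converting y_v to the v' basis must agree with y_vp
print(np.allclose(np.linalg.inv(M) @ y_v, y_vp))  # True
```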
1.2.1 Finding matrix representation of linear transformation
Given a vector $x \in \mathbb{R}^n$ and some linear transformation (or a linear operator) $T$ that acts on the vector, transforming it into another vector according to some prescribed manner $T : x \to y$. Examples of such linear transformations are rotation, elongation, compression, and reflection, and any combination of these operations, but this can not include translation of the vector, since translation is not linear.
The question to answer here is: how do we write down a representation of this linear transformation? It turns out that $T$ can be represented by a matrix of size $n \times n$, and the actual numbers that go into the matrix will depend on which basis in $\mathbb{R}^n$ we choose to represent $x$.

We would like to pick some basis so that the final shape of the matrix is as simple as possible.
Let us pick a set of basis vectors $\{v_1, v_2, \dots, v_n\}$ that span $\mathbb{R}^n$, hence

$$x = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$

Now

$$T(x) = T(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n)$$

and since $T$ is linear we can rewrite the above as

$$T(x) = a_1 T(v_1) + a_2 T(v_2) + \cdots + a_n T(v_n) \tag{1}$$

We see from the above that the new transformed vector $T(x)$ has the same coordinates $a_i$ as $x$ if we view the $T(v_i)$ as the new basis.

Now $T(v_j)$ is itself an application of the linear transformation, but now it is being done on each basis vector $v_j$. Hence it will cause each specific basis vector to transform in some manner to a new vector. Whatever this new vector is, we must be able to represent it in terms of the same basis vectors; therefore, we write

$$T(v_1) = \xi_{11} v_1 + \xi_{21} v_2 + \cdots + \xi_{n1} v_n$$

where $\xi_{ij}$ is the $i$-th coordinate of the vector $T(v_j)$. And we do the same for $T(v_2)$:

$$T(v_2) = \xi_{12} v_1 + \xi_{22} v_2 + \cdots + \xi_{n2} v_n$$

and so on until $T(v_n)$:

$$T(v_n) = \xi_{1n} v_1 + \xi_{2n} v_2 + \cdots + \xi_{nn} v_n$$

Or, compactly,

$$T(v_j) = \sum_{i=1}^{n} \xi_{ij} v_i$$

Now plug the above into (1) and obtain

$$T(x) = a_1 \sum_{i=1}^{n} \xi_{i1} v_i + a_2 \sum_{i=1}^{n} \xi_{i2} v_i + \cdots + a_n \sum_{i=1}^{n} \xi_{in} v_i$$
Hence, if we take the $v_i$ as common factors, we have

$$T(x) = v_1 \left( \xi_{11} a_1 + \xi_{12} a_2 + \cdots + \xi_{1n} a_n \right) + \cdots + v_n \left( \xi_{n1} a_1 + \xi_{n2} a_2 + \cdots + \xi_{nn} a_n \right)$$

Hence, if we write the transformed vector w.r.t. the same basis as $y = T(x) = c_1 v_1 + \cdots + c_n v_n$, then by comparing coefficients of each basis vector we obtain

$$c_i = \xi_{i1} a_1 + \xi_{i2} a_2 + \cdots + \xi_{in} a_n$$

Or in matrix form

$$\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} = \begin{bmatrix} \xi_{11} & \xi_{12} & \cdots & \xi_{1n} \\ \xi_{21} & \xi_{22} & \cdots & \xi_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \xi_{n1} & \xi_{n2} & \cdots & \xi_{nn} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}$$

Hence we finally obtain that $y_v = A\, x_v$. Let us call the matrix $[\xi_{ij}]$ that represents $T$ under the basis $v$ as $A$.
We see from the above that the $j$-th column of $A$ contains the coordinates of the vector $T(v_j)$ w.r.t. the basis. This gives us a quick way to construct $A$: apply $T$ to each basis vector $v_j$, write the resulting vector in terms of the basis, and place those coordinates in the $j$-th column of $A$.
We now see that $A$ will have different numerical values if the basis used to span $\mathbb{R}^n$ is different from the one used to obtain the above $A$.

Let us pick some new basis, say $v' = \{v'_1, v'_2, \dots, v'_n\}$. Let the new representation of $T$ now be the matrix $A'$; then the $j$-th column of $A'$ is found from applying $T$ on the new basis vector $v'_j$:

$$T(v'_j) = \sum_{i=1}^{n} \xi'_{ij} v'_i$$

where now $\xi'_{ij}$ is the $i$-th coordinate of the vector $T(v'_j)$, which will be different from $\xi_{ij}$ since in one case we used the basis set $v$ and in the other case the basis set $v'$. Hence we see that the matrix representation of $T$ will be numerically different depending on the basis used to span the space, even though the linear transformation itself did not change.
1.2.2 Finding matrix representation of change of basis
Now we show how to determine $M$, the matrix which when applied to the vector $x_{v'}$ will result in the vector $x_v$.

Given a vector $x$, it is represented w.r.t. basis $v = \{v_1, \dots, v_n\}$ as

$$x = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$

and w.r.t. basis $v' = \{v'_1, \dots, v'_n\}$ as

$$x = b_1 v'_1 + b_2 v'_2 + \cdots + b_n v'_n$$

But the vector itself is invariant under any change of basis, hence

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = b_1 v'_1 + b_2 v'_2 + \cdots + b_n v'_n \tag{1}$$

Now write each basis vector $v'_j$ in terms of the basis $v$, hence we have

$$\begin{aligned} v'_1 &= c_{11} v_1 + c_{21} v_2 + \cdots + c_{n1} v_n \\ v'_2 &= c_{12} v_1 + c_{22} v_2 + \cdots + c_{n2} v_n \\ &\;\;\vdots \\ v'_n &= c_{1n} v_1 + c_{2n} v_2 + \cdots + c_{nn} v_n \end{aligned} \tag{2}$$

Substitute (2) into the RHS of (1) and we obtain

$$a_1 v_1 + \cdots + a_n v_n = b_1 \left( c_{11} v_1 + \cdots + c_{n1} v_n \right) + \cdots + b_n \left( c_{1n} v_1 + \cdots + c_{nn} v_n \right)$$

Factor out the basis vectors $v_i$ from the RHS and we obtain

$$a_1 v_1 + \cdots + a_n v_n = \left( c_{11} b_1 + c_{12} b_2 + \cdots + c_{1n} b_n \right) v_1 + \cdots + \left( c_{n1} b_1 + c_{n2} b_2 + \cdots + c_{nn} b_n \right) v_n$$

Hence, comparing coefficients of each of the basis vectors $v_i$, we obtain the following result

$$a_i = c_{i1} b_1 + c_{i2} b_2 + \cdots + c_{in} b_n$$

Or in matrix form, we write

$$\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} \tag{3}$$

The above gives a relation between the coordinates of the vector w.r.t. basis $v$ (these are the $a_i$ coordinates) and the coordinates of the same vector w.r.t. basis $v'$ (these are the $b_i$ coordinates). The mapping between these coordinates is the matrix shown above, which we call the $M$ matrix. Since the above matrix returns the coordinates of the vector w.r.t. $v$ when it is multiplied by the coordinates of the vector w.r.t. basis $v'$, we write it as $M_{v' \to v}$ to make it clear from which basis to which basis the conversion is taking place.

Looking at (2) and (3) above, we see that column $j$ of $M$ is the coordinates of $v'_j$ w.r.t. basis $v$.

Hence the above gives a method to generate the change of basis matrix $M$: simply represent each basis vector in $v'$ w.r.t. basis $v$, and take the result of each such representation and write it as a column in $M$. We now show a few examples to illustrate this.
Example showing how to find the change of basis matrix

Given the $\mathbb{R}^2$ basis $v = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}$ and the new basis $v' = \left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \end{bmatrix} \right\}$, find the change of basis matrix $M$.

Column 1 of $M$ is the coordinates of $v'_1$ w.r.t. basis $v$, i.e.

$$\begin{bmatrix} 1 \\ 2 \end{bmatrix} = c_{11} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{21} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \tag{4}$$

and column 2 of $M$ is the coordinates of $v'_2$ w.r.t. basis $v$, i.e.

$$\begin{bmatrix} 3 \\ 1 \end{bmatrix} = c_{12} \begin{bmatrix} 1 \\ 1 \end{bmatrix} + c_{22} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \tag{5}$$

But (4) is $c_{11} + c_{21} = 1$ and $c_{11} = 2$; solving, we obtain $c_{11} = 2$ and $c_{21} = -1$. And (5) is $c_{12} + c_{22} = 3$ and $c_{12} = 1$; solving, we obtain $c_{12} = 1$ and $c_{22} = 2$. Hence

$$M = \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix}$$

Now we can use the above change of basis matrix to find the coordinates of a vector under $v$ given its coordinates under $v'$. For example, given $x_{v'} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, then

$$x_v = M x_{v'} = \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$$
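The same example can be checked numerically (a minimal numpy sketch using the bases chosen above):

```python
import numpy as np

V  = np.column_stack(([1.0, 1.0], [1.0, 0.0]))  # old basis v as columns
Vp = np.column_stack(([1.0, 2.0], [3.0, 1.0]))  # new basis v' as columns

# Column j of M is the coordinates of v'_j w.r.t. basis v: solve V @ m_j = v'_j
M = np.linalg.solve(V, Vp)
print(M)  # [[ 2.  1.]
          #  [-1.  2.]]

x_vp = np.array([1.0, 1.0])  # coordinates w.r.t. v'
x_v = M @ x_vp               # coordinates w.r.t. v
print(x_v)                   # [3. 1.]

# Both coordinate vectors describe the same geometric vector
print(V @ x_v, Vp @ x_vp)    # [4. 3.] [4. 3.]
```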
1.2.3 Examples of similarity transformation
Example 1
This example is from Strang's book (Linear Algebra and its Applications, 4th edition), but shown here with a little more detail.

Consider $\mathbb{R}^2$, and let $T$ be the projection onto a line at angle $\theta$. Let the first basis be the standard basis $v = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}$ and let the second basis be $v' = \left\{ \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} \right\}$, i.e. one basis vector along the projection line and one perpendicular to it.

The first step is to find $M$. The first column of $M$ is found by writing $v'_1$ w.r.t. basis $v$, and the second column of $M$ is found by writing $v'_2$ w.r.t. basis $v$. Hence

$$\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} = c_{11} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_{21} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

or $c_{11} = \cos\theta$ and $c_{21} = \sin\theta$. And

$$\begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} = c_{12} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_{22} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

hence $c_{12} = -\sin\theta$ and $c_{22} = \cos\theta$. Hence

$$M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Now we need to find the representation $A'$ of $T$ w.r.t. basis $v'$. The first column of $A'$ is the new coordinates of the basis vector $v'_1$ after applying $T$ onto it; but $v'_1$ lies on the projection line itself, so

$$T(v'_1) = v'_1 = 1 \cdot v'_1 + 0 \cdot v'_2$$

and the second column of $A'$ is the new coordinates of the basis vector $v'_2$ after applying $T$ onto it; but $v'_2$ is perpendicular to the projection line, so

$$T(v'_2) = \mathbf{0} = 0 \cdot v'_1 + 0 \cdot v'_2$$

Hence

$$A' = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$$

Therefore $A' = M^{-1} A M$ holds with $A$ the representation of the projection w.r.t. the standard basis,

$$A = \begin{bmatrix} \cos^2\theta & \cos\theta \sin\theta \\ \cos\theta \sin\theta & \sin^2\theta \end{bmatrix}$$

Hence we see that the linear transformation $T$ is represented as the full matrix $A$ w.r.t. basis $v$, while it is represented as the diagonal matrix $A'$ w.r.t. basis $v'$. Therefore, it will be better to perform all calculations involving this linear transformation under basis $v'$ instead of basis $v$.
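A quick numerical check of this example (numpy sketch; $\theta = 0.3$ is an arbitrary test angle):

```python
import numpy as np

theta = 0.3  # arbitrary example angle
c, s = np.cos(theta), np.sin(theta)

# Projection onto the line at angle theta, w.r.t. the standard basis
A = np.array([[c*c, c*s],
              [c*s, s*s]])

# Change of basis: columns are the rotated basis vectors v'_1, v'_2
M = np.array([[c, -s],
              [s,  c]])

A_prime = np.linalg.inv(M) @ A @ M
print(np.round(A_prime, 10))  # [[1. 0.]
                              #  [0. 0.]]
```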
Example 2
Consider $\mathbb{R}^2$, and let $T$ be a rotation of the space by angle $\theta$. Let the first basis be the standard basis $v = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}$ and let the second basis be $v' = \left\{ \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} \right\}$.

The first step is to find $M$. The first column of $M$ is found by writing $v'_1$ w.r.t. basis $v$, and the second column of $M$ is found by writing $v'_2$ w.r.t. basis $v$. Hence

$$\begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} = c_{11} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_{21} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

or $c_{11} = \cos\theta$ and $c_{21} = \sin\theta$, and

$$\begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix} = c_{12} \begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_{22} \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

hence $c_{12} = -\sin\theta$ and $c_{22} = \cos\theta$. Hence

$$M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Note also that the representation of $T$ w.r.t. the standard basis $v$ is the rotation matrix itself, $A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$. Now we need to find the representation $A'$ of $T$ w.r.t. basis $v'$. The first column of $A'$ is the new coordinates of the basis vector $v'_1$ after applying $T$ onto it, but

$$T(v'_1) = \begin{bmatrix} \cos 2\theta \\ \sin 2\theta \end{bmatrix} = \cos\theta \, v'_1 + \sin\theta \, v'_2$$

and the second column of $A'$ is the new coordinates of the basis vector $v'_2$ after applying $T$ onto it, but

$$T(v'_2) = \begin{bmatrix} -\sin 2\theta \\ \cos 2\theta \end{bmatrix} = -\sin\theta \, v'_1 + \cos\theta \, v'_2$$

Hence

$$A' = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Therefore $A' = A$.

Hence we see that the linear transformation $T$ is represented by the same matrix w.r.t. basis $v$ as w.r.t. basis $v'$. Therefore the change of basis selected above did not result in making the representation of $T$ any simpler (in fact there was no change in the representation at all). This means we need to find a more direct method of finding the basis under which $T$ has the simplest representation. Clearly we cannot just keep trying different bases to find if $T$ has a simpler representation under the new basis.
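Again a quick numerical check (numpy sketch, arbitrary test angle) confirming that this change of basis leaves the representation unchanged:

```python
import numpy as np

theta = 0.3  # arbitrary example angle
c, s = np.cos(theta), np.sin(theta)

A = np.array([[c, -s],   # rotation by theta w.r.t. the standard basis
              [s,  c]])
M = A.copy()             # the rotated basis gives this same matrix as M

A_prime = np.linalg.inv(M) @ A @ M
print(np.allclose(A_prime, A))  # True: the representation did not change
```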
1.2.4 How to find the best basis to simplify the representation of T?

Our goal of a simpler representation of $T$ is that of a diagonal matrix, or one as close as possible to being diagonal (i.e. Jordan form). Given $A$, and an infinite number of bases that we could select to represent $T$ under, in the hope that we can find a new representation $A'$ such that it is simpler than $A$, we now ask: how does one go about finding such a basis?

It turns out that if we select the eigenvectors of $A$ as the columns of $M$, this will result in $A' = M^{-1} A M$ being diagonal, or as close to being diagonal as possible (block diagonal, or Jordan form).

Let us apply this to the second example in the previous section. In that example we had

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

The eigenvectors of $A$ are $\begin{bmatrix} 1 \\ -i \end{bmatrix}$ and $\begin{bmatrix} 1 \\ i \end{bmatrix}$, with eigenvalues $e^{i\theta}$ and $e^{-i\theta}$. Therefore, with $M = \begin{bmatrix} 1 & 1 \\ -i & i \end{bmatrix}$,

$$A' = M^{-1} A M = \begin{bmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{bmatrix}$$

which is a diagonal representation. Hence we see that the linear transformation $T$ is represented as a diagonal matrix w.r.t. the basis $\left\{ \begin{bmatrix} 1 \\ -i \end{bmatrix}, \begin{bmatrix} 1 \\ i \end{bmatrix} \right\}$.
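The diagonalization can be checked numerically as well (numpy sketch, arbitrary test angle; note the complex eigenvector matrix):

```python
import numpy as np

theta = 0.3  # arbitrary example angle
c, s = np.cos(theta), np.sin(theta)
A = np.array([[c, -s],
              [s,  c]])

# Columns of M are the eigenvectors (1, -i) and (1, i) of the rotation matrix
M = np.array([[1.0,  1.0],
              [-1j,  1j]])

Lam = np.linalg.inv(M) @ A @ M
print(np.round(Lam, 10))                    # diag(e^{i theta}, e^{-i theta})
print(np.exp(1j*theta), np.exp(-1j*theta))  # the same two eigenvalues
```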
1.3 Summary of similarity transformation
To obtain $B = M^{-1} A M$ such that $B$ is real and diagonal requires that $A$ be real and symmetric. The eigenvalues of $A$ go into the diagonal of $B$, and the eigenvectors of $A$ go into the columns of $M$. This is the algebraic view.

Geometrically, $A$ is viewed as the matrix representation, under some basis $v$, of a linear transformation $T$. And $B$ is the matrix representation of the same linear transformation $T$, but now under a new basis $v'$, and $M$ is the matrix that represents the change of basis from $v'$ to $v$.
The question then immediately arises: if $A$ must be real and symmetric (for $B$ to be real and diagonal), what does this mean in terms of the linear transformation $T$ and the change of basis matrix $M$? It clearly means that, under some basis $v$, not every linear transformation represented by $A$ can have a similar matrix $B$ which is real and diagonal. Only those linear transformations which result in a real and symmetric representation can have a similar matrix which is real and diagonal. This was shown in the previous examples, where we saw that $T$ defined as rotation by $\theta$ under the standard basis resulted in

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

and since this is not symmetric, we will not be able to factor this matrix, using a similarity transformation, into a real diagonal matrix, no matter which change of basis we use to represent $T$. The question then is: under a given basis, which class of linear transformations leads to a matrix representation that is symmetric? One such example was shown above, namely the projection onto the line at angle $\theta$. This question needs more time to look into.
We now write $\Lambda$ to represent a diagonal matrix, hence the similarity transformation above can be written as

$$\Lambda = M^{-1} A M$$

or

$$A = M \Lambda M^{-1}$$
2 Singular value decomposition (SVD)
Using the similarity transformation, we found that for a real and symmetric matrix $A$ we are able to decompose it as $A = M \Lambda M^{-1}$, where $\Lambda$ is diagonal and contains the eigenvalues of $A$ on the diagonal, and $M$ contains the right eigenvectors of $A$ in its columns.
2.1 What are right and left eigenvectors?

Right eigenvectors are the standard eigenvectors we have been talking about all along. When $\lambda$ is an eigenvalue of $A$, then $v$ is a right eigenvector when we write

$$A v = \lambda v$$

However, $w$ is a left eigenvector of $A$ when we have

$$w^H A = \lambda w^H$$

where $w^H$ is the Hermitian (conjugate transpose) of $w$.
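A small numpy illustration of the two definitions (the matrix $A$ is an arbitrary non-symmetric example; left eigenvectors are obtained here as eigenvectors of $A^H$):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])  # an arbitrary non-symmetric example

# Right eigenvectors: A v = lambda v
lam, V = np.linalg.eig(A)
v = V[:, 0]
print(np.allclose(A @ v, lam[0] * v))  # True

# Left eigenvectors: w^H A = lambda w^H, i.e. w is an eigenvector of A^H
mu, W = np.linalg.eig(A.conj().T)
w = W[:, 0]
print(np.allclose(w.conj() @ A, mu[0].conj() * w.conj()))  # True
```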
2.2 SVD
Given any arbitrary matrix $A_{m \times n}$, it can be factored into 3 matrices as follows

$$A_{m \times n} = U_{m \times m} \, \Sigma_{m \times n} \, V^H_{n \times n}$$

where $U$ is a unitary matrix ($U^H U = I$ or $U U^H = I$), and $V$ is also a unitary matrix.
These are the steps to perform the SVD:

1. Find the rank of $A$, say $r$.

2. Let $W = A^H A$, hence $W$ is a square $n \times n$ matrix, and it is positive semidefinite, hence its eigenvalues will all be $\geq 0$. Find the eigenvalues of $W$ and call these $\lambda_i$. There will be $n$ such eigenvalues since $W$ is of order $n \times n$, but only $r$ of these will be positive and $n - r$ will be zero. Arrange these eigenvalues such that the non-zero eigenvalues come first, in decreasing order, followed by the zero eigenvalues:

$$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_r > 0, \qquad \lambda_{r+1} = \cdots = \lambda_n = 0$$

3. Initialize the $m \times n$ matrix $\Sigma$ to all zeros. Take the first $r$ eigenvalues from above (the non-zero ones) and take the square root of each, hence we get the singular values $\sigma_i = \sqrt{\lambda_i}$, and write these down the diagonal of $\Sigma$ starting at $\Sigma(1,1)$, i.e. $\Sigma(1,1) = \sigma_1$, $\Sigma(2,2) = \sigma_2$, and so on. Notice that the matrix $\Sigma$ need not be square. Hence we can obtain an arrangement such as the following

$$\Sigma = \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \end{bmatrix}$$

where the matrix $A$ was $2 \times 3$, for example.

4. Now find each eigenvector of $A^H A$. For each eigenvalue, find the corresponding eigenvector. Call these eigenvectors $v_1, v_2, \dots, v_n$.

5. Normalize the eigenvectors found in the above step. Now the eigenvectors will be an orthonormal set of vectors. Take the Hermitian of each eigenvector and make each one of these vectors (now in row format instead of column format) go into a row of the matrix $V^H$, i.e. the first row of $V^H$ will be $v_1^H$, the second row of $V^H$ will be $v_2^H$, and so on.

6. Now we need to find a set of $m$ orthonormal vectors; these will be the columns of the matrix $U$. Find the orthonormal vectors

$$u_i = \frac{1}{\sigma_i} A v_i, \qquad i = 1 \dots r$$

Notice that here we only use the first $r$ eigenvectors found in step 5. Take each one of these vectors and make them into columns of $U$. But since we need $m$ columns in $U$, not just $r$, we need to come up with $m - r$ more vectors such that all the $m$ vectors together form an orthonormal basis for $\mathbb{C}^m$. If doing this by hand, it is easy to find the remaining vectors by inspection. In a program, we could use the same process we used with Gram-Schmidt, where we learned how to find a new vector which is orthonormal to an existing set of vectors. Another way to find the matrix $U$ is to construct the matrix $A A^H$ and find its (normalized) eigenvectors; those go into the columns of $U$.
This completes the factorization; now we have

$$A = U \Sigma V^H$$
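The steps above can be followed literally in code. This is a minimal numpy sketch (the $2 \times 3$ matrix $A$ is an example choice, and the completion of $U$ for the case $r < m$ is only noted in a comment), compared at the end against the reconstruction $U \Sigma V^H = A$:

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 0.0, 2.0]])  # an example 2x3 matrix

m, n = A.shape
r = np.linalg.matrix_rank(A)

# Steps 1-2: eigen-decompose A^H A (Hermitian, positive semidefinite)
lam, V = np.linalg.eigh(A.conj().T @ A)  # eigh returns ascending order
idx = np.argsort(lam)[::-1]              # sort descending: nonzero ones first
lam, V = lam[idx], V[:, idx]

# Step 3: singular values down the diagonal of the m x n matrix Sigma
sigma = np.sqrt(np.clip(lam[:r], 0.0, None))
Sigma = np.zeros((m, n))
Sigma[:r, :r] = np.diag(sigma)

# Steps 4-5: eigh already returns normalized eigenvectors; V^H is V.conj().T

# Step 6: first r columns of U from u_i = A v_i / sigma_i
U = np.zeros((m, m))
U[:, :r] = (A @ V[:, :r]) / sigma
# (if r < m, the remaining columns of U would be completed to an orthonormal
#  basis of C^m, e.g. by Gram-Schmidt or from the eigenvectors of A A^H)

# Step 7: check the factorization
print(np.allclose(U @ Sigma @ V.conj().T, A))  # True
```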
The following diagram helps illustrate the SVD process described above.
3 Conclusion
As a conclusion, the following diagram shows the difference between the similarity transformation factorization of a matrix and the SVD.
Trying to understand the SVD from a geometrical point of view requires more time, and this is left for future research into this subject.
4 References
Gilbert Strang, Linear Algebra and its Applications, 4th edition.

Course notes, Linear Algebra, Math 307, CSUF Mathematics Department, Spring 2007, by Dr. Angel Pineda.

Bernard Friedman, Principles and Techniques of Applied Mathematics, chapter on spectral theory of operators.

Gene Golub and Charles Van Loan, Matrix Computations, 3rd edition.

David Kincaid and Ward Cheney, Numerical Analysis: Mathematics of Scientific Computing.