
## Review of similarity transformation and Singular Value Decomposition

Applied Mathematics Department, California State University, Fullerton. July 8, 2007.

### 1 Similarity transformation

A similarity transformation is

$$B = M^{-1} A M$$

where $A$, $B$, and $M$ are square matrices. The goal of a similarity transformation is to find a matrix $B$ which has a simpler form than $A$, so that we can use $B$ in place of $A$ to ease some computational work. Let us set our goal to have $B$ be a diagonal matrix (a more general simple form is block diagonal, or Jordan form, but here we only look at the case of $B$ being a diagonal matrix).

The question becomes: given $A$, can we find $M$ such that $M^{-1} A M$ is diagonal?

The standard way to show this is an algebraic method, which we present first. Next we show a geometric method that explains the similarity transformation geometrically.

#### 1.1 Derivation of similarity transformation based on algebraic method

Starting with

$$B = M^{-1} A M$$

our goal is to find a real matrix $B$ such that it is diagonal. From the above, by pre-multiplying each side by $M$ we obtain

$$M B = A M$$

Now, since our goal is to make $B$ diagonal, let us select the eigenvalues of $A$ to be the diagonal of $B$. We write the above in expanded form as follows

$$A \begin{pmatrix} m_1 & m_2 & \cdots & m_n \end{pmatrix} = \begin{pmatrix} m_1 & m_2 & \cdots & m_n \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$$

where $m_i$ is the $i$-th column of $M$. The above can be written as separate equations by looking at the column view of matrix multiplication

$$A m_1 = \lambda_1 m_1$$

and

$$A m_2 = \lambda_2 m_2$$

all the way to the $n$-th column of $M$

$$A m_n = \lambda_n m_n$$

Each of the above equations is in the form

$$A x = \lambda x$$

and this is the key idea.

The above shows that if we take the diagonal elements of $B$ to be the eigenvalues of $A$, then the columns of $M$ are the (right) eigenvectors of $A$.

Hence, if we want the matrix $B$ to be diagonal and real, the matrix $A$ itself must satisfy some conditions. Specifically, the eigenvalues of $A$ must all be real, and $A$ must have $n$ linearly independent eigenvectors (which is guaranteed when the eigenvalues are distinct). This happens, in particular, when $A$ is real and symmetric. Therefore, we have the general result that given a matrix $A$ with real eigenvalues and a full set of linearly independent eigenvectors, we can find a matrix similar to it which is real and diagonal, and the diagonal values are the eigenvalues of $A$.
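As a quick numerical illustration of this result, the following is a minimal sketch (the matrix $A$ below is an arbitrary real symmetric example) showing that placing the eigenvectors of $A$ in the columns of $M$ makes $M^{-1} A M$ diagonal.

```python
import numpy as np

# Minimal sketch: diagonalize a real symmetric matrix A by a similarity
# transformation B = M^{-1} A M, where the columns of M are the (right)
# eigenvectors of A.  The matrix A below is just an illustrative choice.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

eigenvalues, M = np.linalg.eig(A)   # columns of M are eigenvectors of A
B = np.linalg.inv(M) @ A @ M        # similarity transformation

print(np.round(B, 10))              # diagonal, with the eigenvalues of A on the diagonal
print(eigenvalues)
```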

#### 1.2 Derivation of similarity transformation based on geometric method

It turns out that this derivation is more complicated than the algebraic approach. It involves a change of basis matrix (which will play the role of $M$ above) and the representation of a linear transformation as a matrix under some basis (which will play the role of $A$). Let us start from the beginning.

A vector in $\mathbb{R}^n$ can have many different representations (or coordinates) depending on which basis is used. There are infinitely many bases that span $\mathbb{R}^n$, hence there are infinitely many representations of the same vector. Consider for example $\mathbb{R}^2$. The standard basis vectors are $(1,0)$ and $(0,1)$, but any two linearly independent vectors, such as $(1,1)$ and $(1,-1)$, also form a basis, and so on.

In $\mathbb{R}^n$, any $n$ linearly independent vectors can be used as a basis. Let $u$ be the representation of a vector w.r.t. a basis $v = \{v_1, \dots, v_n\}$, and let $u'$ be the representation of the same vector w.r.t. another basis $v' = \{v'_1, \dots, v'_n\}$.

An important problem in linear algebra is to obtain $u$ given $u'$ and the change of basis matrix.

This requires finding a matrix representation for the change of basis. Once this matrix is found, we can write

$$u = P \, u' \tag{1}$$

where $P$ is the change of basis matrix which, when applied to $u'$, returns the coordinates of the vector w.r.t. basis $v$.

From (1) we see that given $u$, then to obtain $u'$ we write

$$u' = P^{-1} u \tag{1A}$$

Another important problem is that of a linear transformation $T$, where now we apply some transformation to the whole space and we want to find what happens to the space coordinates (all the position vectors) due to this transformation.

Consider for example a $T$ which is a rotation in $\mathbb{R}^2$ by some angle $\theta$. Here, every position vector in the space is affected by this rotation, but the basis remains fixed, and we want to find the new coordinates of each point in the space.

Let $u$ be the position vector before applying the linear transformation, and let $y$ be the new position vector. Notice that both vectors are written with respect to the same basis $v$. Similar to the change of basis, we want a way to find $y$ from $u$; hence we need a way to represent this linear transformation by a matrix, say $A$, with respect to the basis $v$, so that we can write the following

$$y = A \, u \tag{2}$$

Assume the basis is changed to $v'$. Let the representation of the same linear transformation w.r.t. the basis $v'$ be called $B$, so we write

$$y' = B \, u' \tag{2A}$$

Hence when the basis changes from $v$ to $v'$, the representation of the transformation changes from $A$ to $B$.

Now assume we are given some linear transformation $T$ and its representation $A$ w.r.t. basis $v$. How could we find its representation $B$ w.r.t. a new basis $v'$?

From (2) we have

$$y = A \, u$$

But from (1), $u = P u'$ and likewise $y = P y'$, hence the above becomes

$$P y' = A \, P u'$$

Pre-multiplying the above equation from the left by $P^{-1}$ we obtain

$$P^{-1} P \, y' = P^{-1} A P \, u'$$

But since $P^{-1} P = I$, the above reduces to

$$y' = P^{-1} A P \, u'$$

But from (2A), $y' = B u'$, hence the above becomes

$$B \, u' = P^{-1} A P \, u'$$

Therefore

$$B = P^{-1} A P \tag{3}$$

Notice the difference between a change of basis and a linear transformation. In a change of basis, the vector remains fixed in space, but the basis changes, and we want to find the coordinates of the vector w.r.t. the new basis. With a linear transformation, the vector itself is changed from $u$ to $y$, but both vectors are expressed w.r.t. the same basis.

Equation (3) allows us to obtain a new matrix representation of the linear transformation $T$ by expressing the same $T$ w.r.t. a different basis. Hence if we are given a representation $A$ of $T$ which is not the most convenient one, we can, by a change of basis, obtain a different representation $B$ of the same $T$ by using (3). The most convenient matrix representation of a linear transformation is a diagonal matrix.

Equation (3) is called a similarity transformation. To make it easier to compare (3) above with what we wrote in the previous section, where we derived the same relation using the algebraic approach, we let $M = P$; hence (3) becomes

$$B = M^{-1} A M$$

The matrix $B$ is similar to $A$; both matrices represent the same linear transformation applied to the space, just under different bases. We will show below some examples of how to find $B$ given $A$ and $P$. But first we show how to obtain the matrix representation of $T$.
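Before moving on, here is a small numerical check of equation (3); the transformation matrix, basis, and vector below are arbitrary illustrative choices, not taken from the text.

```python
import numpy as np

# Check that A (w.r.t. basis v) and B = P^{-1} A P (w.r.t. basis v') describe
# the same linear transformation.  All values are arbitrary illustrative choices.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])            # representation of T w.r.t. basis v
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # columns: basis vectors v'_j written w.r.t. v
B = np.linalg.inv(P) @ A @ P          # representation of T w.r.t. basis v'

u_prime = np.array([1.0, 2.0])        # a vector expressed in basis v'
u = P @ u_prime                       # same vector expressed in basis v

y = A @ u                             # apply T using the v representation
y_prime = B @ u_prime                 # apply T using the v' representation
print(np.allclose(y, P @ y_prime))    # True: both give the same transformed vector
```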

##### 1.2.1 Finding matrix representation of linear transformation

Given a vector $x$ and some linear transformation (or linear operator) $T$ that acts on the vector, transforming it into another vector $y$ according to some prescribed manner, $T: x \to y$. Examples of such linear transformations are rotation, elongation, compression, and reflection, and any combination of these operations, but not translation of the vector, since translation is not linear.

The question to answer here is: how do we write down a representation of this linear transformation? It turns out that $T$ can be represented by a matrix of size $n \times n$, and the actual numbers that go into the matrix, or the shape of the matrix, will depend on which basis of $\mathbb{R}^n$ we choose to represent $x$.

We would like to pick a basis so that the final shape of the matrix is as simple as possible.

Let us pick a set of basis vectors $v = \{v_1, v_2, \dots, v_n\}$ that span $\mathbb{R}^n$, hence

$$x = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$

Now

$$T(x) = T(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n)$$

And since $T$ is linear we can rewrite the above as

$$T(x) = a_1 T(v_1) + a_2 T(v_2) + \cdots + a_n T(v_n) \tag{1}$$

We see from the above that the new transformed vector $T(x)$ has the same coordinates $a_i$ as $x$ if we view the $T(v_i)$ as the new basis.

Now $T(v_i)$ itself is an application of the linear transformation, but now it is applied to each basis vector $v_i$. Hence it will cause each specific basis vector to transform in some manner to a new vector. Whatever this new vector is, we must be able to represent it in terms of the same basis vectors $\{v_1, \dots, v_n\}$; therefore, we write

$$T(v_1) = \xi_{11} v_1 + \xi_{21} v_2 + \cdots + \xi_{n1} v_n$$

where $\xi_{ij}$ is the $i$-th coordinate of the vector $T(v_j)$.

And we do the same for $T(v_2)$

$$T(v_2) = \xi_{12} v_1 + \xi_{22} v_2 + \cdots + \xi_{n2} v_n$$

and so on until $T(v_n)$

$$T(v_n) = \xi_{1n} v_1 + \xi_{2n} v_2 + \cdots + \xi_{nn} v_n$$

Or

$$T(v_j) = \sum_{i=1}^{n} \xi_{ij} v_i$$

Now plug the above into (1) and obtain

$$T(x) = a_1 \left( \xi_{11} v_1 + \cdots + \xi_{n1} v_n \right) + a_2 \left( \xi_{12} v_1 + \cdots + \xi_{n2} v_n \right) + \cdots + a_n \left( \xi_{1n} v_1 + \cdots + \xi_{nn} v_n \right)$$

Hence, if we take the $v_i$ as common factors, we have

$$T(x) = v_1 \left( a_1 \xi_{11} + a_2 \xi_{12} + \cdots + a_n \xi_{1n} \right) + \cdots + v_n \left( a_1 \xi_{n1} + a_2 \xi_{n2} + \cdots + a_n \xi_{nn} \right)$$

Writing the transformed vector w.r.t. the same basis as $T(x) = b_1 v_1 + b_2 v_2 + \cdots + b_n v_n$ and comparing coefficients of each basis vector, we obtain

$$b_i = \xi_{i1} a_1 + \xi_{i2} a_2 + \cdots + \xi_{in} a_n$$

Or in matrix form

$$\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = \begin{pmatrix} \xi_{11} & \xi_{12} & \cdots & \xi_{1n} \\ \xi_{21} & \xi_{22} & \cdots & \xi_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \xi_{n1} & \xi_{n2} & \cdots & \xi_{nn} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}$$

Hence we finally obtain that the coordinates of $T(x)$ are found by multiplying the coordinates of $x$ by the matrix $[\xi_{ij}]$.

Let us call the matrix that represents $T$ under the basis $v$ as $A$:

$$A = \begin{pmatrix} \xi_{11} & \xi_{12} & \cdots & \xi_{1n} \\ \xi_{21} & \xi_{22} & \cdots & \xi_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \xi_{n1} & \xi_{n2} & \cdots & \xi_{nn} \end{pmatrix}$$

We see from the above that the $j$-th column of $A$ contains the coordinates of the vector $T(v_j)$ w.r.t. the basis $v$. This gives us a quick way to construct $A$: apply $T$ to each basis vector $v_j$, and place the coordinates of the resulting vector in the $j$-th column of $A$.

We now see that $A$ will have different numerical values if the basis used to span $\mathbb{R}^n$ is different from the one used to obtain the $A$ above.

Let us pick some new basis, say $v' = \{v'_1, \dots, v'_n\}$. Let the new representation of $T$ be the matrix $B$; then the $j$-th column of $B$ is found by applying $T$ to the new basis vector $v'_j$

$$T(v'_j) = \sum_{i=1}^{n} \zeta_{ij} v'_i$$

where now $\zeta_{ij}$ is the $i$-th coordinate of the vector $T(v'_j)$ w.r.t. the basis $v'$, which will be different from $\xi_{ij}$ since in one case we used the basis set $v$ and in the other case the basis set $v'$. Hence we see that the matrix representation will be numerically different depending on the basis used to span the space, even though the linear transformation itself did not change.
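The following minimal sketch illustrates the construction: apply $T$ to each basis vector and place the resulting coordinates as the columns of the representation matrix. The transformation (a 90-degree rotation) and the use of the standard basis are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Build the matrix representation of a linear transformation T by applying T
# to each basis vector and stacking the resulting coordinate vectors as columns.
# T here is a 90-degree rotation in R^2 and the basis is the standard basis,
# so T(v_j) is already expressed in that basis; both are illustrative choices.

def T(x):
    """Rotate a vector in R^2 by 90 degrees counter-clockwise."""
    return np.array([-x[1], x[0]])

basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # standard basis

# Column j of A holds the coordinates of T(v_j) w.r.t. the basis.
A = np.column_stack([T(v) for v in basis])
print(A)      # [[ 0. -1.]
              #  [ 1.  0.]]
```

For a non-standard basis, each $T(v_j)$ would additionally have to be solved for its coordinates w.r.t. that basis (one small linear solve per column).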

##### 1.2.2 Finding matrix representation of change of basis

Now we show how to determine $P$, the matrix which, when applied to the coordinate vector $u'$, produces the coordinate vector $u$.

Given a vector, it is written w.r.t. the basis $v$ as

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$

and w.r.t. the basis $v'$ as

$$b_1 v'_1 + b_2 v'_2 + \cdots + b_n v'_n$$

But the vector itself is invariant under any change of basis, hence

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = b_1 v'_1 + b_2 v'_2 + \cdots + b_n v'_n \tag{1}$$

Now write each basis vector $v'_j$ in terms of the basis $v$; hence we have

$$v'_j = c_{1j} v_1 + c_{2j} v_2 + \cdots + c_{nj} v_n , \qquad j = 1, \dots, n \tag{2}$$

Substituting (2) into the RHS of (1) we obtain

$$a_1 v_1 + \cdots + a_n v_n = b_1 \left( c_{11} v_1 + \cdots + c_{n1} v_n \right) + \cdots + b_n \left( c_{1n} v_1 + \cdots + c_{nn} v_n \right)$$

Factoring out the basis vectors $v_i$ on the RHS, we obtain

$$a_1 v_1 + \cdots + a_n v_n = \left( c_{11} b_1 + c_{12} b_2 + \cdots + c_{1n} b_n \right) v_1 + \cdots + \left( c_{n1} b_1 + c_{n2} b_2 + \cdots + c_{nn} b_n \right) v_n$$

Now, comparing the coefficients of each basis vector $v_i$ on both sides, we obtain the following result

$$a_i = c_{i1} b_1 + c_{i2} b_2 + \cdots + c_{in} b_n$$

or in matrix form, we write

$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} \tag{3}$$

The above gives a relation between the coordinates of the vector w.r.t. basis $v$ (the $a_i$ coordinates) and the coordinates of the same vector w.r.t. basis $v'$ (the $b_i$ coordinates). The mapping between these coordinates is the matrix shown above, which we call the change of basis matrix $P$. Since this matrix returns the coordinates of the vector w.r.t. $v$ when it multiplies the coordinates of the vector w.r.t. basis $v'$, we can write it as $P_{v' \to v}$ to make it clear from which basis to which basis the conversion takes place.

Looking at (2) and (3) above, we see that column $j$ of $P$ holds the coordinates of $v'_j$ w.r.t. the basis $v$.

Hence the above gives a method to generate the change of basis matrix $P$: simply represent each basis vector of $v'$ w.r.t. the basis $v$, and write each such representation as a column of $P$. We now show a few examples to illustrate this.
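A minimal sketch of this construction (the two bases below are arbitrary examples, given by their standard-basis coordinates):

```python
import numpy as np

# Build the change of basis matrix P whose column j holds the coordinates of
# the new basis vector v'_j expressed w.r.t. the old basis v.  Both bases are
# arbitrary illustrative choices, given here by their standard-basis coordinates.
v  = np.column_stack([[1.0, 0.0], [0.0, 1.0]])    # old basis v (as columns)
vp = np.column_stack([[1.0, 1.0], [1.0, -1.0]])   # new basis v' (as columns)

# The coordinates c_j of v'_j w.r.t. v solve  v @ c_j = v'_j  (one solve per column).
P = np.linalg.solve(v, vp)

b = np.array([2.0, 3.0])    # coordinates of some vector w.r.t. v'
a = P @ b                   # coordinates of the same vector w.r.t. v
print(P)
print(a)
```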

Example showing how to find the change of basis matrix

Given a basis $v = \{v_1, v_2\}$ of $\mathbb{R}^2$ and a new basis $v' = \{v'_1, v'_2\}$, find the change of basis matrix $P$.

Column 1 of $P$ is the coordinates of $v'_1$ w.r.t. the basis $v$, i.e.

$$v'_1 = c_{11} v_1 + c_{21} v_2 \tag{4}$$

and column 2 of $P$ is the coordinates of $v'_2$ w.r.t. the basis $v$, i.e.

$$v'_2 = c_{12} v_1 + c_{22} v_2 \tag{5}$$

Written out componentwise, (4) is a pair of equations in the two unknowns $c_{11}$ and $c_{21}$; solving it gives the first column of $P$. Similarly, (5) is a pair of equations in $c_{12}$ and $c_{22}$; solving it gives the second column. Hence

$$P = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$$

Now we can use the above change of basis matrix to find the coordinates of a vector under $v$ given its coordinates under $v'$. For example, given the coordinates $u'$ w.r.t. $v'$, then $u = P u'$.

##### 1.2.3 Examples of similarity transformation
Example 1. This example is from Strang's book (Linear Algebra and Its Applications, 4th edition), but is shown here with a little more detail. Consider $\mathbb{R}^2$, and let $T$ be the projection onto a line at angle $\theta$. Let the first basis $v$ be the standard basis $v_1 = (1,0)$, $v_2 = (0,1)$, and let the second basis be $v'_1 = (\cos\theta, \sin\theta)$, $v'_2 = (-\sin\theta, \cos\theta)$ (a unit vector along the line and a unit vector perpendicular to it).

The first step is to find $P$.

The first column of $P$ is found by writing $v'_1$ w.r.t. the basis $v$, and the second column of $P$ is found by writing $v'_2$ w.r.t. the basis $v$.

Hence

$$v'_1 = c_{11} v_1 + c_{21} v_2$$

or $c_{11} = \cos\theta$ and $c_{21} = \sin\theta$.

And

$$v'_2 = c_{12} v_1 + c_{22} v_2$$

hence $c_{12} = -\sin\theta$ and $c_{22} = \cos\theta$.

Hence

$$P = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

Now we need to find the representation $A$ of $T$ w.r.t. the basis $v$.

The first column of $A$ is the coordinates of $T(v_1)$, the image of the basis vector $v_1$ after applying $T$ to it. Projecting $v_1 = (1,0)$ onto the line gives

$$T(v_1) = \begin{pmatrix} \cos^2\theta \\ \cos\theta \sin\theta \end{pmatrix}$$

and the second column of $A$ is the coordinates of $T(v_2)$, the image of the basis vector $v_2$ after applying $T$ to it:

$$T(v_2) = \begin{pmatrix} \cos\theta \sin\theta \\ \sin^2\theta \end{pmatrix}$$

Hence

$$A = \begin{pmatrix} \cos^2\theta & \cos\theta \sin\theta \\ \cos\theta \sin\theta & \sin^2\theta \end{pmatrix}$$

Therefore

$$B = P^{-1} A P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$

Hence we see that the linear transformation $T$ is represented by $\begin{pmatrix} \cos^2\theta & \cos\theta\sin\theta \\ \cos\theta\sin\theta & \sin^2\theta \end{pmatrix}$ w.r.t. the basis $v$, while it is represented by the much simpler matrix $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ w.r.t. the basis $v'$. Therefore, it is better to perform calculations involving this linear transformation under the basis $v'$ instead of the basis $v$.
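A quick numerical check of this example for one particular angle (the choice of 30 degrees is arbitrary):

```python
import numpy as np

# Verify Example 1 numerically: projection onto the line at angle theta is
# represented by [[1,0],[0,0]] in the basis aligned with that line.
theta = np.deg2rad(30)                 # arbitrary angle for illustration
c, s = np.cos(theta), np.sin(theta)

A = np.array([[c*c, c*s],
              [c*s, s*s]])             # representation w.r.t. the standard basis
P = np.array([[c, -s],
              [s,  c]])                # columns: v'_1, v'_2 w.r.t. the standard basis

B = np.linalg.inv(P) @ A @ P
print(np.round(B, 10))                 # [[1. 0.]
                                       #  [0. 0.]]
```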

Example 2

Consider $\mathbb{R}^2$, and let $T$ be a rotation of the space by an angle $\theta$. Let the first basis $v$ be the standard basis $v_1 = (1,0)$, $v_2 = (0,1)$, and let the second basis $v'$ be some other basis $\{v'_1, v'_2\}$.

The first step is to find $P$.

The first column of $P$ is found by writing $v'_1$ w.r.t. the basis $v$, and the second column of $P$ is found by writing $v'_2$ w.r.t. the basis $v$. Since $v$ is the standard basis, the columns of $P$ are simply the vectors $v'_1$ and $v'_2$ themselves.

Now we need to find the representation $A$ of $T$ w.r.t. the basis $v$. The first column of $A$ is the coordinates of $T(v_1)$, the image of the basis vector $v_1$ after applying $T$ to it, which is $(\cos\theta, \sin\theta)$, and the second column of $A$ is the coordinates of $T(v_2)$, which is $(-\sin\theta, \cos\theta)$.

Hence

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

Therefore

$$B = P^{-1} A P$$

Carrying out this computation for the basis $v'$ chosen here, one finds $B = A$ (this happens, for example, whenever the new basis is itself just a rotation of the standard basis, since rotations of the plane commute with each other).

Hence we see that the linear transformation $T$ is represented by the same matrix w.r.t. the basis $v'$ as w.r.t. the basis $v$. Therefore the change of basis selected above did not result in making the representation of $T$ any simpler (in fact there was no change in the representation at all). This means we need a more direct method of finding the basis under which $T$ has the simplest representation. Clearly we cannot just keep trying different bases to see whether $T$ has a simpler representation under the new basis.
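A small numerical illustration of this outcome (both angles below are arbitrary choices): when the new basis is itself a rotated copy of the standard basis, the representation of a rotation does not change.

```python
import numpy as np

# When the new basis is a rotation of the standard basis, the representation
# of a rotation T is unchanged (B = A), since plane rotations commute.
theta, phi = np.deg2rad(45), np.deg2rad(20)   # arbitrary angles for illustration

def rot(a):
    """Rotation matrix for angle a (standard basis representation)."""
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

A = rot(theta)                   # representation of T w.r.t. the standard basis
P = rot(phi)                     # new basis = standard basis rotated by phi
B = np.linalg.inv(P) @ A @ P
print(np.allclose(A, B))         # True: the representation did not change
```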

##### 1.2.4 How to find the best basis to simplify the representation of T?

Our goal of a simpler representation of $T$ is a diagonal matrix, or a matrix as close as possible to being diagonal (i.e. the Jordan form). Given the representation $A$ of $T$ under some basis $v$, and an infinite number of bases that we could select in the hope of finding a new representation $B$ that is simpler than $A$, we now ask: how does one go about finding such a basis?

It turns out that if we select the eigenvectors of $A$ as the columns of $P$, this will result in $B$ being diagonal, or as close to diagonal as possible (block diagonal, or Jordan form).

Let us apply this to the second example in the previous section. In that example, we had

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

The eigenvectors of $A$ are $\begin{pmatrix} 1 \\ -i \end{pmatrix}$ and $\begin{pmatrix} 1 \\ i \end{pmatrix}$, hence

$$P = \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix}$$

Therefore

$$B = P^{-1} A P = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}$$

which is a diagonal representation (note that it is complex, not real, since $A$ is not symmetric).

Hence we see that the linear transformation $T$ is represented by a diagonal matrix w.r.t. the basis made of the eigenvectors of $A$.
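A short numerical check of this (the angle is an arbitrary choice; note the result is complex-valued):

```python
import numpy as np

# Diagonalize the rotation representation from Example 2 by using its
# eigenvectors as the change of basis.  The angle is an arbitrary choice.
theta = np.deg2rad(30)
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues, P = np.linalg.eig(A)     # complex eigenvalues exp(+/- i*theta)
B = np.linalg.inv(P) @ A @ P          # diagonal, but complex-valued
print(np.round(B, 10))
print(np.round(eigenvalues, 10))
```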

#### 1.3 Summary of similarity transformation

To obtain $B = M^{-1} A M$ such that $B$ is real and diagonal requires that $A$ be real and symmetric. The eigenvalues of $A$ go into the diagonal of $B$, and the eigenvectors of $A$ go into the columns of $M$. This is the algebraic view.

Geometrically, $A$ is viewed as the matrix representation, under some basis $v$, of a linear transformation $T$; $B$ is the matrix representation of the same linear transformation $T$ but under a new basis $v'$; and $M$ is the change of basis matrix whose columns are the new basis vectors $v'_j$ expressed w.r.t. the basis $v$.

The question then immediately arises: if $A$ must be real and symmetric (for $B$ to be real and diagonal), what does this mean in terms of the linear transformation $T$ and the change of basis matrix $M$? It clearly means that, under some basis $v$, not every linear transformation represented by $A$ can have a similar matrix $B$ which is real and diagonal. Only those linear transformations whose representation is real and symmetric can have a similar matrix which is real and diagonal. This was shown in the previous examples, where we saw that $T$ defined as a rotation by $\theta$ under the standard basis resulted in $A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$, and since this matrix is not symmetric, we will not be able to factor it by a similarity transformation into a real diagonal matrix, no matter which change of basis we try to represent $T$ under. The question then is: under a given basis, what is the class of linear transformations that lead to a symmetric matrix representation? One such example was shown above, namely the projection onto the line at angle $\theta$. This question needs more time to look into.

We now write $\Lambda$ to represent a diagonal matrix; hence the similarity transformation above can be written as

$$\Lambda = M^{-1} A M$$

or

$$A = M \Lambda M^{-1}$$

### 2 Singular value decomposition (SVD)

Using the similarity transformation, we found that a real and symmetric matrix $A$ can be decomposed as $A = M \Lambda M^{-1}$, where $\Lambda$ is diagonal and contains the eigenvalues of $A$ on the diagonal, and $M$ contains the right eigenvectors of $A$ in its columns.

#### 2.1 What are right and left eigenvectors?

Right eigenvectors are the standard eigenvectors we have been talking about all along. When $\lambda$ is an eigenvalue of $A$, then $x$ is a right eigenvector when we write

$$A x = \lambda x$$

However, $x$ is a left eigenvector of $A$ when we have

$$x^{H} A = \lambda x^{H}$$

where $x^{H}$ is the Hermitian (conjugate transpose) of $x$.
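A small numerical check of the two definitions (the matrix below is an arbitrary example):

```python
import numpy as np

# Check the right and left eigenvector definitions, A x = lambda x and
# x^H A = lambda x^H, for an arbitrary example matrix.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# Right eigenvectors: columns of V.
lam, V = np.linalg.eig(A)
x = V[:, 0]
print(np.allclose(A @ x, lam[0] * x))                          # True

# Left eigenvectors of A are right eigenvectors of A^H (with conjugate eigenvalues).
mu, W = np.linalg.eig(A.conj().T)
w = W[:, 0]
print(np.allclose(w.conj() @ A, np.conj(mu[0]) * w.conj()))    # True
```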

#### 2.2 SVD

Given any arbitrary matrix $A_{m \times n}$, it can be factored into three matrices as follows

$$A_{m \times n} = U_{m \times m} \, \Sigma_{m \times n} \, V^{H}_{n \times n}$$

where $U$ is a unitary matrix ($U^{H} U = U U^{H} = I$), and $V$ is also a unitary matrix ($V^{H} V = V V^{H} = I$).

These are the steps to do SVD:

1. Find the rank of $A$, say $r$.

2. Let $W = A^{H} A$; hence $W$ is an $n \times n$ square matrix, and it is positive semi-definite, hence its eigenvalues will all be $\geq 0$. Find the eigenvalues of $W$ and call these $\sigma_i^{2}$. There will be $n$ such eigenvalues since $W$ is of order $n \times n$, but only $r$ of them will be positive and $n - r$ will be zero. Arrange these eigenvalues such that the non-zero eigenvalues come first, in decreasing order, followed by the zero eigenvalues: $\sigma_1^{2} \geq \sigma_2^{2} \geq \cdots \geq \sigma_r^{2} > 0$, $\; \sigma_{r+1}^{2} = \cdots = \sigma_n^{2} = 0$.

3. Initialize the matrix $\Sigma_{m \times n}$ to all zeros. Take the first $r$ eigenvalues from above (the non-zero ones), take the square root of each to obtain the singular values $\sigma_1, \dots, \sigma_r$, and write these down the diagonal of $\Sigma$ starting at $\Sigma(1,1)$, i.e. $\Sigma(1,1) = \sigma_1$, $\Sigma(2,2) = \sigma_2$, $\dots$, $\Sigma(r,r) = \sigma_r$. Notice that $\Sigma$ need not be a square matrix. Hence we can obtain an arrangement such as the following for $r = 2$

$$\Sigma = \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

where the matrix $A$ was $4 \times 3$, for example.

4. Now find each eigenvector of $A^{H} A$. For each eigenvalue, find the corresponding eigenvector. Call these eigenvectors $v_1, v_2, \dots, v_n$.

5. Normalize the eigenvectors found in the above step. The eigenvectors $v_1, \dots, v_n$ now form an orthonormal set of vectors. Take the Hermitian of each eigenvector, $v_i^{H}$, and let each of these (now in row format instead of column format) go into a row of the matrix $V^{H}$, i.e. the first row of $V^{H}$ is $v_1^{H}$, the second row of $V^{H}$ is $v_2^{H}$, and so on.

6. Now we need to find a set of $m$ orthonormal vectors; these will be the columns of the matrix $U$. Find the orthonormal vectors $u_i = \frac{1}{\sigma_i} A v_i$ for $i = 1, \dots, r$. Notice that here we only use the first $r$ eigenvectors found in step 5. Take each one of these vectors and make it a column of $U$. But since we need $m$ columns in $U$, not just $r$, we need to come up with $m - r$ more vectors such that all $m$ vectors together form an orthonormal basis of $\mathbb{C}^{m}$ (the first $r$ of them span the column space of $A$). If doing this by hand, it is easy to find the remaining vectors by inspection. In a program, we could use the Gram-Schmidt process, which finds a new vector orthonormal to an existing set of vectors. Another way to find the matrix $U$ is to construct the matrix $A A^{H}$ and find its orthonormalized eigenvectors; these go into the columns of $U$.

7. This completes the factorization; we now have

$$A = U \, \Sigma \, V^{H}$$

*Diagram: illustration of the SVD process described above.*
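The following is a minimal sketch of the steps above for a small real matrix (the matrix is an arbitrary example and the code assumes $m \ge n$): it builds $V$ and the singular values from the eigen-decomposition of $A^{H} A$, forms the first $r$ columns of $U$ as $u_i = A v_i / \sigma_i$, completes $U$ to an orthonormal basis, and checks that $U \Sigma V^{H}$ reproduces $A$.

```python
import numpy as np

# Minimal sketch of the SVD steps above (assumes a real matrix with m >= n).
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])                  # an arbitrary 3 x 2 example, rank r = 2
m, n = A.shape

# Steps 2, 4, 5: eigen-decompose A^H A, sort eigenvalues in decreasing order.
w, V = np.linalg.eigh(A.T @ A)
order = np.argsort(w)[::-1]
w, V = w[order], V[:, order]
sigma = np.sqrt(np.clip(w, 0.0, None))      # singular values
r = int(np.sum(sigma > 1e-12))              # step 1: rank of A

# Step 3: build Sigma (m x n) with the singular values on its diagonal.
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(sigma)

# Step 6: first r columns of U are u_i = A v_i / sigma_i; the remaining
# columns are any orthonormal completion (taken here from a full QR).
U_r = A @ V[:, :r] / sigma[:r]
Q, _ = np.linalg.qr(U_r, mode='complete')   # Q is m x m; columns r.. span the complement
U = np.hstack([U_r, Q[:, r:]])

# Step 7: check the factorization A = U Sigma V^H.
print(np.allclose(U @ Sigma @ V.T, A))      # True
print(np.allclose(U.T @ U, np.eye(m)))      # True: U is unitary (orthogonal here)
```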

### 3 Conclusion

As a conclusion, the difference between the two factorizations can be summarized as follows: the similarity transformation $A = M \Lambda M^{-1}$ applies to a square matrix (with $\Lambda$ real and diagonal when $A$ is real and symmetric) and uses a single basis, the eigenvectors of $A$, while the SVD $A = U \Sigma V^{H}$ applies to any $m \times n$ matrix and uses two different orthonormal bases (the columns of $U$ and the columns of $V$).

Trying to understand the SVD from a geometrical point of view requires more time, and this is left for future research into this subject.

### 4 References

1. Gilbert Strang, *Linear Algebra and Its Applications*, 4th edition.
2. Course notes, Linear Algebra, Math 307, CSUF Mathematics Department, Spring 2007, Dr. Angel Pineda.
3. Bernard Friedman, *Principles and Techniques of Applied Mathematics*, chapter on spectral theory of operators.
4. Gene Golub and Charles Van Loan, *Matrix Computations*, 3rd edition.
5. David Kincaid and Ward Cheney, *Numerical Analysis: Mathematics of Scientific Computing*.