PDF (letter size)
PDF (legal size)

Notes on Sturm Liouville

Nasser M. Abbasi

April 15, 2019   Compiled on April 18, 2021 at 4:50pm


1 Definitions
2 Proofs
 2.1 Proof symmetry of operator
 2.2 Proof eigenvalues are real
 2.3 Proof eigenfunctions are unique
 2.4 Proof eigenfunctions are real
 2.5 Proof eigenfunctions are orthogonal with weight
3 Special functions generated from solving singular Sturm-Liouville
4 General notes on S-L
5 Methods of solutions to some Sturm Liouville problems
6 Examples converting second order linear ODE to Sturm-Liouville
 6.1 Example 1
 6.2 Example 2
 6.3 Example 3
 6.4 Example 4
 6.5 Example 5
 6.6 Example 6
 6.7 Example 7

1 Definitions

Regular Sturm-Liouville ODE is an eigenvalue boundary value ODE. This means the ODE has an eigenvalue in it \(\lambda \), where solutions exists only for specific values of eigenvalues. The ODE is\[ \frac {d}{dx}\left (p\relax (x) \frac {dy}{dx}\right ) +q\left ( x\right ) y\relax (x) +\lambda \sigma \relax (x) y\left ( x\right ) =0\qquad a<x<b \]

Some books use \(-q\relax (x) y\relax (x) \) in the above instead of \(+q\relax (x) y\relax (x) \). The above also can be written as

\[ py^{\prime \prime }+p^{\prime }y^{\prime }+\left (q+\lambda \sigma \right ) y=0 \] With the restrictions that \(p\relax (x) ,q\relax (x) ,\sigma \relax (x) \) are real functions that are continuous everywhere over \(a\leq x\leq b\) and also we need \(p\relax (x) >0,\sigma \left ( x\right ) >0\). The \(\sigma \relax (x) \) is called the weight function. But this is not all. The boundary conditions must be linear homogeneous, of this form\begin {align*} \beta _{1}y\relax (a) +\beta _{2}y^{\prime }\relax (a) & =0\\ \beta _{3}y\relax (b) +\beta _{4}y^{\prime }\relax (b) & =0 \end {align*}

Where \(\beta _{i}\) are just real constants. Some of them can be zero but not all. For example, \(\beta _{1}y\relax (a) =0,\beta _{4}y^{\prime }\left ( b\right ) =0\) is OK.

Boundary conditions do not have to mixed, but they can be in general. But they must be homogeneous.

Notice that periodic boundary conditions are not allowed. Well, they are allowed, but then the problem is no longer called Sturm-Liouville. The above is just the definition of the equation and its boundary conditions. Below is list of the important properties of this ODE. Each one of these properties have a proof.

\(\blacksquare \) All eigenvalues \(\lambda _{n}\) are real. No complex \(\lambda _{n}\).

\(\blacksquare \) Each eigenvalue \(\lambda _{n}\), will have one and only one real eigenfunction \(\phi _{n}\relax (x) \) associated with it. (in higher dimensions, the eigenvalue problem can have more than one eigenfunction associated with one eigenvalue. For example, heat PDE in rectangle). But we can use Gram-schmidt to make these eigenfunctions orthogonal if we need to). But for exam, just worry about 1D for now.

\(\blacksquare \) There is smallest \(\lambda \), called \(\lambda _{1}\) and there are infinite number of eigenvalues. Hence eigenvalues are ordered \(\lambda _{1}<\lambda _{2}<\lambda _{3}\cdots \)

\(\blacksquare \) Each eigenfunction \(\phi _{n}\relax (x) \) will have \(n-1\) zeros in \(a<x<b\). Note that this does not include the end points. This means, \(\phi _{3}\relax (x) \) will have two zeros inside the domain. i.e. there are two \(x\) locations where \(\phi _{3}\relax (x) =0\) in \(a<x<b\).

\(\blacksquare \) Eigenfunctions \(\phi _{n}\relax (x) \) make a complete set of basis. This means any piecewise continuous function \(f\relax (x) \) can be represented as \(f\relax (x) \sim \sum _{n=1}^{\infty }a_{n}\phi _{n}\relax (x) \). This is called the generalized Fourier series.

\(\blacksquare \) This follows from (5). Each eigenfunction is orthogonal to another eigenfunction, but with the weight in there. This means \(\int _{a}^{b}\phi _{n}\relax (x) \phi _{m}\relax (x) \sigma \left ( x\right ) dx=0\) if \(n\neq m\).

\(\blacksquare \) Rayleigh quotient relates an eigenvalue to its eigenfunction. Starting with the SL ODE \(\left (p\phi ^{\prime }\right ) ^{\prime }+q\phi =-\lambda \sigma \phi \), then multiplying by\(\ \phi \) and integrating, we obtain \(\int _{a}^{b}\phi \left (p\phi ^{\prime }\right ) ^{\prime }+q\phi ^{2}dx=\) \(-\int _{a}^{b}\lambda \sigma \phi ^{2}dx\) and solving for \(\lambda \), gives \[ \lambda =\frac {\int _{a}^{b}\phi \left (p\phi ^{\prime }\right ) ^{\prime }+q\phi ^{2}dx}{\int _{a}^{b}\phi ^{2}\sigma \relax (x) dx}\] Carrying integration by parts on first part of the integral in numerator, it becomes \[ \lambda =\frac {\left (-p\phi \phi ^{\prime }\right ) _{b}^{a}+\int _{a}^{b}p\left (\phi ^{\prime }\right ) ^{2}-q\phi ^{2}dx}{\int _{a}^{b}\phi ^{2}\sigma \relax (x) dx}\] But this becomes much simpler when we plug-in the boundary conditions that we must use, making the above \[ \lambda =\frac {\int _{a}^{b}p\left (\phi ^{\prime }\right ) ^{2}-q\phi ^{2}dx}{\int _{a}^{b}\phi ^{2}\sigma \relax (x) dx}\] And if \(q=0,p=1\) and \(\sigma \relax (x) =1\), then it becomes \(\lambda =\frac {\int _{a}^{b}\left (\phi ^{\prime }\right ) ^{2}dx}{\int _{a}^{b}\phi ^{2}dx}.\) Rayleigh quotient is useful to show that \(\lambda \) can be positive without solving for \(\phi \) and also used to estimate the value of the minimum \(\lambda \) by replacing \(\phi \) by a trial function and actually solving for \(\lambda \) using \(\lambda =\frac {\int _{a}^{b}\phi \left (p\phi ^{\prime }\right ) ^{\prime }+q\phi ^{2}dx}{\int _{a}^{b}\phi ^{2}\sigma \relax (x) dx}\) to obtain a numerical estimate of \(\lambda _{1}\).

\(\blacksquare \) There is symmetry relation. In operator form, let \(L\equiv \frac {d}{dx}\left (p\relax (x) \frac {d}{dx}\right ) +q\left ( x\right ) \), then we have \(\int _{a}^{b}uL\left [ v\right ] -vL\left [ u\right ] dx=0\) where \(u,v\) are any two different eigenfunctions.

\(\blacksquare \) There is also what is called Lagrange identity, which says \(uL\left [ v\right ] -vL\left [ u\right ] =\frac {d}{dx}\left (p\left ( uv^{\prime }-vu^{\prime }\right ) \right ) \). This comes from simply expanding \(L[v],L\left [ u\right ] \) and simplifying things. Just calculus.

\(\blacksquare \) Green formula, follows from Lagrange identity, which just gives the integral form \(\int _{a}^{b}uL\left [ v\right ] -vL\left [ u\right ] dx=\left [ p\left (uv^{\prime }-vu^{\prime }\right ) \right ] _{a}^{b}\). But form Sturm-Liouville, we know that \(\int _{a}^{b}uL\left [ v\right ] -vL\left [ u\right ] dx=0\) (from 8). So this really just says that \(\left [ p\left ( uv^{\prime }-vu^{\prime }\right ) \right ] _{a}^{b}=0\). But we know this already from boundary conditions. So I am not sure why this is useful for Sturm-Liouville now. Since it is just saying the same thing again.

\(\blacksquare \) If \(p\relax (x) =0\) at the left or right end (boundaries), then the problem is now called singular Sturm-Liouville. This is actually the important case. In regular S.L., \(p\relax (x) \) must be positive everywhere. We only consider \(p\relax (x) =0\) at the ends, not in the middle or any other place. When this happens, then solutions to the ODE generate special functions, such as Bessel, Legendre, Chebyshev. These special functions are solution to the Singular S.L. ODE, not the regular S.L. ODE. In addition, at the end where \(p\relax (x) =0\), the boundary conditions must be bounded. Not the regular boundary conditions above. For example, if \(p\relax (a) =0\) where \(x=a\) is say the left end, the the boundary conditions at the left end must be \(y\relax (a) <\infty ,y^{\prime }\relax (a) <\infty \). Notice that singular can be at left end, right end or at both at same time. At the end where \(p\relax (x) =0\), the boundary conditions must be bounded type.

\(\blacksquare \) Symmetry relation \[ \int _{a}^{b}uL\left [ v\right ] -vL\left [ u\right ] dx=0 \] where \(u,v\) are are two eigenfunction. But remember, this does NOT mean the integrand is identically zero, or \(uL\left [ v\right ] -vL\left [ u\right ] =0\). Only when \(u,v\) happened to have same eigenvalue we can say that \(uL\left [ v\right ] -vL\left [ u\right ] =0\). But for unique eigenfunctions only the integral form is zero \(\int _{a}^{b}uL\left [ v\right ] -vL\left [ u\right ] dx=0.\) This is important difference, used in the proof below, that different eigenfunctions have different eigenvalues (for the scalar SL case). For higher dimensions, we can have more than one eigenfunction for same eigenvalue.

Small note added April 18, 2021. The above is the same as showing the operator \(L\) is Hermitian. In bra/ket Dirac notation, an operator \(L\) is Hermitian when

\begin {align*} \langle u|L|v\rangle ^{\ast } & =\langle v|L|u\rangle \\ \left (\int u^{\ast }\relax (x) L\left [ v\relax (x) \right ] dx\right ) ^{\ast } & =\int v^{\ast }\relax (x) L\left [ u\left ( x\right ) \right ] dx\\ \int u\relax (x) L\left [ v^{\ast }\relax (x) \right ] dx & =\int v^{\ast }\relax (x) L\left [ u\relax (x) \right ] dx \end {align*}

In the proofs below, I did not do the complex conjugate, since it is assumed all eigenfunctions \(u,v\) are real, and I only wrote that the symmetry relation implies

\[ \int u\relax (x) L\left [ v\relax (x) \right ] dx=\int v\left ( x\right ) L\left [ u\relax (x) \right ] dx \]

It should be really called the Hermitian relation to be more accurate. But for real eigenfunctions, it is the same.

2 Proofs

2.1 Proof symmetry of operator

Given regular Sturm-Liouville (RSL) ODE\begin {align} \left (py^{\prime }\right ) ^{\prime }+qy & =-\lambda \sigma y\tag {1}\\ py^{\prime \prime }+p^{\prime }y^{\prime }+qy & =-\lambda \sigma y\nonumber \end {align}

In operator form \[ L\equiv \frac {d}{dx}\left (p\relax (x) \frac {d}{dx}\right ) +q\left ( x\right ) \] And (1) becomes\[ L[y]=-\lambda \sigma y \] When solving RSL ode, since it is an eigenvalue ODE with associated boundary conditions, we will get infinite number of non-negative eigenvalues.

For each eigenvalue, there is associated with it one eigenfunction (for 1D case). Looking at any two different eigenfunctions, say \(u,v\), then the symmetry relation says the following\[ \int _{a}^{b}uL\left [ v\right ] dx=\int _{a}^{b}vL\left [ u\right ] dx \] Now we will show the above is true. This requires integration by parts two times. We start from the LHS expression and at the end we should end up with the integral on the RHS. Let \(I=\int _{a}^{b}uL\left [ v\right ] dx\), then \begin {align} I & =\int _{a}^{b}uL\left [ v\right ] dx\tag {1}\\ & =\int _{a}^{b}u\left (\frac {d}{dx}\left (p\frac {d}{dx}v\right ) +qv\right ) dx\nonumber \end {align}

Where we used \(L\left [ v\right ] =\left (\frac {d}{dx}\left (p\frac {d}{dx}\right ) +q\right ) v=\left (\frac {d}{dx}\left (p\frac {d}{dx}v\right ) +qv\right ) \) in the above. Hence\begin {equation} I=\int _{a}^{b}u\overset {dV}{\overbrace {\frac {d}{dx}\left (p\frac {dv}{dx}\right ) }}dx+\int _{a}^{b}qvudx \tag {2} \end {equation} Now we will do integration by parts on the first integral. Let \(I_{2}=\int _{a}^{b}u\frac {d}{dx}\left (p\frac {dv}{dx}\right ) dx\). Using \(\int UdV=UV-\int VdU\), and if we let \(U=u,dV=\frac {d}{dx}\left (p\frac {dv}{dx}\right ) \), then \(dU=\frac {du}{dx},V=p\frac {dv}{dx}\). Hence \[ I_{2}=\left (up\frac {dv}{dx}\right ) _{a}^{b}-\int _{a}^{b}\overset {dV}{\overbrace {\frac {dv}{dx}}}\overset {U}{\overbrace {p\frac {du}{dx}}}dx \] We now apply integration by parts again to the second integral above. But now let \(U=p\frac {du}{dx}\) and \(dV=\frac {dv}{dx}\), hence \(dU=\frac {d}{dx}\left ( p\frac {du}{dx}\right ) \) and \(V=v\), therefore the above becomes\[ I_{2}=\left (up\frac {dv}{dx}\right ) _{a}^{b}-\left [ \left (p\frac {du}{dx}v\right ) _{a}^{b}-\int _{a}^{b}v\frac {d}{dx}\left (p\frac {du}{dx}\right ) dx\right ] \] Substituting the above into (2) gives\begin {equation} I=\left (up\frac {dv}{dx}\right ) _{a}^{b}-\left [ \left (p\frac {du}{dx}v\right ) _{a}^{b}-\int _{a}^{b}v\frac {d}{dx}\left (p\frac {du}{dx}\right ) dx\right ] +\int _{a}^{b}qvudx \tag {3} \end {equation} Now comes the part where the boundary conditions are important. In RSL, the boundary conditions are such that all the terms \(\left (up\frac {dv}{dx}\right ) _{a}^{b},\left (p\frac {du}{dx}v\right ) _{a}^{b}\) vanish. This is because the boundary conditions are\begin {align*} \beta _{1}u\relax (a) +\beta _{2}u^{\prime }\relax (a) & =0\\ \beta _{3}u\relax (b) +\beta _{4}u^{\prime }\relax (b) & =0 \end {align*}

And\begin {align*} \beta _{1}v\relax (a) +\beta _{2}v^{\prime }\relax (a) & =0\\ \beta _{3}v\relax (b) +\beta _{4}v^{\prime }\relax (b) & =0 \end {align*}

So now (3) becomes\begin {align*} I & =-\left [ -\int _{a}^{b}v\frac {d}{dx}\left (p\frac {du}{dx}\right ) dx\right ] +\int _{a}^{b}qvudx\\ & =\int _{a}^{b}v\frac {d}{dx}\left (p\frac {du}{dx}\right ) dx+\int _{a}^{b}qvudx\\ & =\int _{a}^{b}v\left (\frac {d}{dx}\left (p\frac {du}{dx}\right ) +qu\right ) dx\\ & =\int _{a}^{b}vL\left [ u\right ] dx \end {align*}

Therefore, we showed that \(\int _{a}^{b}uL\left [ v\right ] dx=\int _{a}^{b}vL\left [ u\right ] dx\). The only thing to watch for here, is which term to make \(U\) and which to make \(dV\) when making integration by parts. To remember, always start with \(uL[v]\) and make \(dV\) the term with \(\frac {dv}{dx}\) in them during integration parts.

2.2 Proof eigenvalues are real

Assume \(\lambda \) is complex. The corresponding eigenfunctions are complex also. Let \(\phi \) be corresponding eigenfunction\begin {equation} L\left [ \phi \right ] =-\lambda \sigma \phi \tag {1} \end {equation} Taking complex conjugate of both sides, and since \(\sigma \) is real, we obtain\[ \overline {L\left [ \phi \right ] }=-\overline {\lambda }\sigma \overline {\phi }\] But \(\overline {L\left [ \phi \right ] }=L\left [ \overline {\phi }\right ] \) since all the coefficients of the ODE are real. The above becomes\begin {equation} L\left [ \overline {\phi }\right ] =-\overline {\lambda }\sigma \overline {\phi } \tag {2} \end {equation} But by symmetry, we know that\[ \int _{a}^{b}\phi L\left [ \overline {\phi }\right ] -\overline {\phi }L\left [ \phi \right ] dx=0 \] Substituting  (1),(2) into the above gives\begin {align*} \int _{a}^{b}\phi \left (-\overline {\lambda }\sigma \overline {\phi }\right ) -\overline {\phi }\left (-\lambda \sigma \phi \right ) dx & =0\\ \int _{a}^{b}-\overline {\lambda }\sigma \phi \overline {\phi }+\lambda \sigma \overline {\phi }\phi dx & =0\\ \int _{a}^{b}\left (\lambda -\overline {\lambda }\right ) \left (\sigma \overline {\phi }\phi \right ) dx & =0\\ \left (\lambda -\overline {\lambda }\right ) \int _{a}^{b}\sigma \overline {\phi }\phi dx & =0 \end {align*}

But \(\overline {\phi }\phi =\left \vert \phi \right \vert \) which is positive. Also the weight \(\sigma \relax (x) \) is positive by definition. Hence for the above to zero, it must be that \(\lambda =\overline {\lambda }\). Which means \(\lambda \) is real. QED.

So the main tools to use in this proof: Definition of \(L\left [ \phi \right ] =-\lambda \sigma \phi \,,\) and symmetry relation and that \(\overline {L\left [ \phi \right ] }=L\left [ \overline {\phi }\right ] \). This might come up in the exam.

2.3 Proof eigenfunctions are unique

Now will show that there is one eigenfunction associated with each eigenvalue (Again, this is for 1D, it is possible to get more than one eigenfunction for same eigenvalue for 2D, as mentioned earlier).  By contradiction, assume that \(\lambda \) has two eigenfunctions \(\phi _{1},\phi _{2}\) associated with it. Hence\begin {align*} L\left [ \phi _{1}\right ] & =-\lambda \sigma \phi _{1}\\ L\left [ \phi _{2}\right ] & =-\lambda \sigma \phi _{2} \end {align*}

From the first equation, \(\lambda =\frac {-L\left [ \phi _{1}\right ] }{\sigma \phi _{1}}\), substituting this into the second equation gives\begin {align*} L\left [ \phi _{2}\right ] & =\frac {L\left [ \phi _{1}\right ] }{\phi _{1}}\phi _{2}\\ \phi _{1}L\left [ \phi _{2}\right ] -\phi _{2}L\left [ \phi _{1}\right ] & =0 \end {align*}

By Lagrange identity, \(\phi _{1}L\left [ \phi _{2}\right ] -\phi _{2}L\left [ \phi _{1}\right ] =\frac {d}{dx}\left (p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) \right ) \), hence this means that \begin {align*} \frac {d}{dx}\left (p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) \right ) & =0\\ p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) & =c_{1} \end {align*}

Where \(c_{1}\) is some constant. This is the main difference between the above argument, and between Lagrange identity. This can be confusing. So let me talk more about this. In Lagrange identity, we write \[ \phi _{1}L\left [ \phi _{2}\right ] -\phi _{2}L\left [ \phi _{1}\right ] =\frac {d}{dx}\left (p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) \right ) \] And when \(\phi _{2},\phi _{1}\) also satisfy the SL boundary condition, only then we say that\begin {align*} \int _{a}^{b}\phi _{1}L\left [ \phi _{2}\right ] -\phi _{2}L\left [ \phi _{1}\right ] dx & =0\\ \left . p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) \right \vert _{a}^{b} & =0 \end {align*}

But the above is not the same as saying \(\phi _{1}L\left [ \phi _{2}\right ] -\phi _{2}L\left [ \phi _{1}\right ] =0\). This is important to keep in mind. Only the integral form is zero for any two functions with the SL B.C..Now we continue. We showed that \(p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) =c\). In SL, this constant is zero due to B.C. Hence\[ p\left (\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }\right ) =0 \] But \(p>0\) by definition. Hence \(\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }=0\) or\begin {align*} \frac {d}{dx}\left (\frac {\phi _{2}}{\phi _{1}}\right ) & =0\\ \frac {\phi _{2}}{\phi _{1}} & =c_{2} \end {align*}

or \(\phi _{2}=c_{2}\phi _{1}\). So the eigenfunctions are linearly dependent. One is just scaled version of the other. But eigenfunction must be linearly independent.

Hence assumption is not valid, and there can not be two linearly independent eigenfunctions for same eigenvalue. Notice also that \(\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }=0\) is the just the Wronskian. When it is zero, we know the functions are linearly dependent. The important part in the above proof, is that \(\phi _{1}\phi _{2}^{\prime }-\phi _{2}\phi _{1}^{\prime }=0\) only when \(\phi _{1},\phi _{2}\) happened to have same eigenvalue.

2.4 Proof eigenfunctions are real

The idea of this proof is to assume the eigenfunction is complex, then show that its real part and its complex part both satisfy the ODE and the boundary conditions. But since they are both use the same eigenvalue, then the real part and the complex part must be linearly dependent. This implies the eigenfunction must be real. (think of Argand diagram)

Assume that \(\phi =U+iV\) is complex eigenfunction with real part \(U\) and complex part \(V\). Then since\[ L\left [ \phi \right ] =-\lambda \sigma \phi \] The above is just writing the Sturm-Liouville ODE in operator form, where \(L\) is the operator as above.  Now we have\[ L\left [ U+iV\right ] =-\lambda \sigma \left (U+iV\right ) \] By linearity of operator \(L\)\[ L\left [ U\right ] +iL\left [ V\right ] =-\lambda \sigma U-i\lambda \sigma V \] Which implies\begin {align*} L\left [ U\right ] & =-\lambda \sigma U\\ L\left [ V\right ] & =-\lambda \sigma V \end {align*}

So we showed that the real and complex part satisfy S.L. Now we need to show they are satisfy S.L. boundary conditions. Since\begin {align*} \beta _{1}\phi \relax (a) +\beta _{2}\phi ^{\prime }\relax (a) & =0\\ \beta _{3}\phi \relax (b) +\beta _{4}\phi ^{\prime }\relax (b) & =0 \end {align*}

Where \(x=a,x=b\) are the left and right ends of the domain. Then\begin {align*} \beta _{1}\left (U\relax (a) +iV\relax (a) \right ) +\beta _{2}\left (U^{\prime }\relax (a) +iV^{\prime }\relax (a) \right ) & =0\\ \beta _{3}\left (U\relax (b) +iV\relax (b) \right ) +\beta _{4}\left (U^{\prime }\relax (b) +iV^{\prime }\relax (b) \right ) & =0 \end {align*}

Hence\begin {align*} \beta _{1}U\relax (a) +\beta _{2}U^{\prime }\relax (a) & =0\\ \beta _{1}V\relax (a) +\beta _{2}V^{\prime }\relax (a) & =0\\ \beta _{3}U\relax (b) +\beta _{4}U^{\prime }\relax (b) & =0\\ \beta _{3}V\relax (b) +\beta _{4}V^{\prime }\relax (b) & =0 \end {align*}

So the above means both \(U\) and \(V\) satisfy the boundary conditions of S.L. But since both \(U,V\) have the same eigenvalue, then they must be linearly dependent, since we know with S.L. each eigenfunction (or one linearly dependent to it) have only one eigenvalue. This means\[ V=cU \] Where \(c\) is some constant. In other words, \begin {align*} \phi & =U+iV\\ & =U+icU\\ & =U\left (1+ic\right ) \\ & =c_{0}U \end {align*}

Where \(c_{0}\) is new constant. (OK it happens to be complex constant, but it is OK to do so, we always do this trick in other places, if it will make me feel better, I could take the magnitude of the constant). So all what the above says, is that we assumed \(\phi \) to be complex, and found that it is real. So it can’t be complex.

2.5 Proof eigenfunctions are orthogonal with weight

Given two different eigenfunctions \(\phi _{1},\phi _{2}\). Hence\begin {align} L\left [ \phi _{1}\right ] & =-\lambda _{1}\sigma \phi _{1}\tag {1}\\ L\left [ \phi _{2}\right ] & =-\lambda _{2}\sigma \phi _{2} \tag {2} \end {align}

From symmetry integral relation, since these eigenfunctions also satisfy S.L. boundary conditions, we can write\[ \int _{a}^{b}\phi _{1}L\left [ \phi _{2}\right ] -\phi _{2}L\left [ \phi _{1}\right ] dx=0 \] Replacing (1,2) into the above\begin {align*} \int _{a}^{b}\phi _{1}\left (-\lambda _{2}\sigma \phi _{2}\right ) -\phi _{2}\left (-\lambda _{1}\sigma \phi _{1}\right ) dx & =0\\ \int _{a}^{b}-\lambda _{2}\sigma \phi _{1}\phi _{2}+\lambda _{1}\sigma \phi _{2}\phi _{1}dx & =0\\ \int _{a}^{b}\left (\lambda _{1}-\lambda _{2}\right ) \left (\sigma \phi _{1}\phi _{2}\right ) dx & =0\\ \left (\lambda _{1}-\lambda _{2}\right ) \int _{a}^{b}\sigma \phi _{1}\phi _{2}dx & =0 \end {align*}

But \(\left (\lambda _{1}-\lambda _{2}\right ) \neq 0\) since there are different eigenvalues for different eigenfunctions. Hence \[ \int _{a}^{b}\sigma \phi _{1}\relax (x) \phi _{2}\relax (x) dx=0 \] Which means \(\phi _{1,}\phi _{2}\) are orthogonal to each others with weight \(\sigma \relax (x) \)

3 Special functions generated from solving singular Sturm-Liouville

When S.L. is singular, meaning \(p=0\) at one or both ends, we end with important class of ODE’s, whose solutions are special functions (not \(\sin \) or \(\cos \)) as the case with the regular S.L. Recall that S.L. is\begin {align} \frac {d}{dx}\left (p\relax (x) \frac {dy}{dx}\right ) +q\left ( x\right ) y\relax (x) +\lambda \sigma \relax (x) y\left ( x\right ) & =0\tag {1}\\ py^{\prime \prime }+p^{\prime }y^{\prime }+qy & =-\lambda \sigma y\nonumber \end {align}

With the regular S.L., we say that \(p\relax (x) >0\) over the whole domain, including end points. But with singular, this is not the case.  Here are three important S.L. ODE’s that are singular.

Bessel equation: \begin {align*} x^{2}y^{\prime \prime }+xy+\lambda x^{2}y & =0\qquad 0<x<1\\ xy^{\prime \prime }+y+\lambda xy & =0 \end {align*}

Or in standard form\[ \left (xy^{\prime }\right ) ^{\prime }=-\lambda xy \] Comparing to SL form, then \(p\relax (x) =x,q\relax (x) =0,\sigma \relax (x) =x\). At \(x=0\), then \(p\relax (x) =0\), which is what makes it singular. (also the weight happens to be zero), but we only care about \(p\) being zero or not, at one of the ends. So to check if S.L. is singular or not, just need to check if \(p\relax (x) \) is zero or not at one of the ends.

As mentioned before, when \(p=0\) at one of the ends, we can’t use the standard B.C. for the regular S.L. instead, at the end where \(p=0\), we must use what is called bounded boundary conditions, which is in this case \(y\relax (0) <\infty ,y^{\prime }\relax (0) <\infty \). The solution to this ODE will be in terms of Bessel functions. Notice that this happened to be singular, due to the domain starting at \(x=0\). If the domain happened to be from \(x=0.5\) to say \(x=1\), then it is no longer singular S.L. but regular one.

Legendre equation \[ \left (\left (1-x^{2}\right ) y^{\prime }\right ) ^{\prime }=-\lambda y\qquad 0<x<1 \] Or it can be written as\[ \left (1-x^{2}\right ) y^{\prime \prime }-2xy^{\prime }=-\lambda y \] We see that \(p=1-x^{2}\). And now it happens to be that \(p=0\) at \(x=1\). So it is singular at the other end compared to Bessel. In this case \(q=0,\sigma =1\). Again, the boundary conditions at \(x=1\) must now be bounded. i.e. \(y\left ( 1\right ) <\infty ,y^{\prime }\relax (1) <\infty \). On the other end, where \(p\) is not zero, we still use the standard boundary conditions  \(\beta _{1}y\relax (0) +\beta _{2}y^{\prime }\relax (0) =0\).

Chebyshev equation\[ \left (\sqrt {1-x^{2}}y^{\prime }\right ) ^{\prime }=-\lambda \sqrt {1-x^{2}}y\qquad -1<x<1 \] Or\[ \sqrt {1-x^{2}}y^{\prime \prime }-\frac {1}{2}\frac {2x}{\sqrt {1-x^{2}}}y^{\prime }=-\lambda \sqrt {1-x^{2}}y \] So we see that \(p=\sqrt {1-x^{2}},q=0,\sigma =\sqrt {1-x^{2}}\). So where is \(p=0\) here? At \(x=-1\) then \(p=\sqrt {1-1}=0\) and at \(x=-1\) then \(p=0\) also. So this is singular at both ends. So we need to use bounded boundary conditions at both ends now to solve this.\begin {align*} y\left (-1\right ) & <\infty \\ y^{\prime }\left (-1\right ) & <\infty \\ y\left (+1\right ) & <\infty \\ y^{\prime }\left (+1\right ) & <\infty \end {align*}

The solution to this singular S.L. is given in terms of special function Chebyshev.

4 General notes on S-L

\(\blacksquare \) Solution to \(xy^{\prime \prime }+y^{\prime }=-\lambda xy\). Set in S-L form, we get \(\left (py^{\prime }\right ) ^{\prime }+qy=-\lambda \sigma y\), then \(p=x,q=0,\sigma =x\). For \(a<x<b\) with \(y\relax (b) =0,y\left ( a\right ) <\infty \). This is singular S-L. Using Frobenius series, the solution comes out to be \(y\relax (x) =\sum _{n=1}^{\infty }c_{n}J_{0}\left ( \sqrt {\lambda _{n}}x\right ) \), where \(J_{0}\) is Bessel function of first kind. \(\sqrt {\lambda _{n}}\) are roots of \(0=J_{0}\left (\sqrt {\lambda }b\right ) \). (check if this should be \(0=J_{0}\left (\sqrt {\lambda }\left ( b-a\right ) \right ) \). \(c_{n}\) is constant, which still needs to be found.

\(\blacksquare \) We can find approximate solution to S-L ODE for large eigenvalue, without solving the ODE using series method, but by using WKB. Given \(\left (py^{\prime }\right ) ^{\prime }+qy=-\lambda \sigma y\) which is standard form S-L. Using physical optics correction, we obtain the solution as\[ y\relax (x) \sim \left (\sigma \relax (x) p\relax (x) \right ) ^{\frac {-1}{4}}\left (c_{1}\cos \left (\sqrt {\lambda }\int _{0}^{x}\sqrt {\frac {\sigma \relax (t) }{p\relax (t) }}dt\right ) +c_{1}\sin \left (\sqrt {\lambda }\int _{0}^{x}\sqrt {\frac {\sigma \left ( t\right ) }{p\relax (t) }}dt\right ) \right ) \] Where \(c_{1},c_{2}\) are found from boundary conditions now. The above is valid for ”large” \(\lambda \), and is found by first letting \(\varepsilon ^{2}=\frac {1}{\lambda }\) and then assuming \(y\relax (x) =\exp \left (\frac {1}{\varepsilon }\sum _{n=0}^{\infty }\varepsilon ^{n}S_{n}\relax (x) \right ) \). And working though the WKB method. Remember, WKB only works for linear homogeneous ODE and is used to estimate solution for large \(\lambda \) (or small \(\varepsilon \)).

\(\blacksquare \) in 1D, S-L is \(\left (py^{\prime }\right ) ^{\prime }+qy=-\lambda \sigma y\) with \(0<x<L\), with B.C. given by \(\beta _{1}y\left ( 0\right ) +\beta _{2}y^{\prime }\relax (0) =0\) and \(\beta _{1}y\left ( L\right ) +\beta _{2}y^{\prime }\relax (L) =0,\) and in higher dimensions, this problem becomes \(\nabla \cdot \left (p\nabla \phi \right ) +q\phi =-\lambda \sigma \phi \) with boundary conditions now written as \(\beta _{1}\phi +\beta _{2}\left (\nabla \phi \cdot \hat {n}\right ) =0\) on boundary \(\Gamma \). In higher dimensions, \(\phi \equiv \phi \left (\bar {x}\right ) \).

\(\blacksquare \) Lagrange’s identity in 1D is \[ uL\relax (v) -vL\relax (u) =\frac {d}{dx}\left (p\left ( u\frac {dv}{dx}-v\frac {du}{dx}\right ) \right ) \] where \(L\) above is the S-L operator, as in \(L\equiv \frac {d}{dx}\left ( p\frac {d}{dx}\right ) +q\) so that we write \(L\relax (u) =-\lambda \sigma y\,\).  

\(\blacksquare \) When it says eigenfunctions are normalized, it should mean that  \(\int _{a}^{b}\phi _{n}^{2}\relax (x) \sigma \relax (x) \ dx=1\) where \(\sigma \relax (x) \) is the weight function (comes from SL equation) and \(a<x<b\) is the domain. This is for 1D SL. Remember for example, that \(\int _{0}^{L}\cos ^{2}\left (\sqrt {\lambda _{n}}x\right ) dx=\frac {L}{2}\). where \(\sqrt {\lambda _{n}}=\frac {n\pi }{L}\). Here, it is not normalized since the result is not one. The weight here is just one and here \(\phi _{n}\left ( x\right ) =\cos \left (\frac {n\pi }{L}x\right ) \)

\(\blacksquare \) Heat PDE in cylinder is \(\frac {\partial T\left ( r,z,\theta ,t\right ) }{\partial t}=k\left (\frac {\partial ^{2}T}{\partial r^{2}}+\frac {1}{r}\frac {\partial T}{\partial r}+\frac {\partial ^{2}T}{\partial z^{2}}+\frac {1}{r^{2}}\frac {\partial ^{2}T}{\partial \theta ^{2}}\right ) .\) For steady state, it becomes, say cylinder has length \(L\) \begin {align*} 0 & =\nabla ^{2}T\left (r,z,\theta \right ) =\frac {\partial ^{2}T}{\partial r^{2}}+\frac {1}{r}\frac {\partial T}{\partial r}+\frac {\partial ^{2}T}{\partial z^{2}}+\frac {1}{r^{2}}\frac {\partial ^{2}T}{\partial \theta ^{2}}\\ \left \vert T\left (0,z,\theta \right ) \right \vert & <\infty \\ \left \vert T^{\prime }\left (0,z,\theta \right ) \right \vert & <\infty \\ T\left (r,0\right ) & =f\relax (r) \\ R\left (r,L\right ) & =g\relax (r) \end {align*}

And for axis symmetric (no \(\theta \) dependency), it becomes \[ 0=\nabla ^{2}T\left (r,z\right ) =\frac {\partial ^{2}T}{\partial r^{2}}+\frac {1}{r}\frac {\partial T}{\partial r}+\frac {\partial ^{2}T}{\partial z^{2}}\] The solution is found by separation of variables. For the above is in terms of Bessel function order zero\[ T\left (r,z\right ) =\sum _{n=1}^{\infty }J_{0}\left (\sqrt {\lambda _{0}}r\right ) \left (A_{n}e^{\sqrt {\lambda _{n}}z}+B_{n}e^{\sqrt {\lambda _{n}}z}\right ) \] which is found using series method \(T=\sum _{n=0}^{\infty }a_{n}r^{n+\alpha }\). The eigenvalues \(\lambda _{n}\) are zeros of \(0=J_{0}\left (\sqrt {\lambda }r_{0}\right ) \) where \(r_{0}\) is disk radius.

\(\blacksquare \) Heat PDE in 2D polar is \(\frac {\partial T\left ( r,\theta ,t\right ) }{\partial t}=k\left (\frac {\partial ^{2}T}{\partial r^{2}}+\frac {1}{r}\frac {\partial T}{\partial r}+\frac {1}{r^{2}}\frac {\partial ^{2}T}{\partial \theta ^{2}}\right ) \) and for steady state it becomes \(0=\frac {\partial ^{2}T}{\partial r^{2}}+\frac {1}{r}\frac {\partial T}{\partial r}+\frac {1}{r^{2}}\frac {\partial ^{2}T}{\partial \theta ^{2}}\). After separating, the \(r\) ODE becomes Euler ODE. Use guess \(R\relax (r) =r^{p}\) for solution. The solution will be\[ T\left (r,z\right ) =A_{0}+\sum _{n=1}^{\infty }A_{n}\cos \left (n\theta \right ) r^{n}+\sum _{n=1}^{\infty }B_{n}\sin \left (n\theta \right ) r^{n}\] \(\blacksquare \) To change \(p\relax (x) y^{\prime \prime }+q\left ( x\right ) y^{\prime }+r\relax (x) y\relax (x) =0\) to S.L. form, multiply the ODE by \(\mu =\frac {1}{p\relax (x) }\exp \left (\int \frac {q\relax (x) }{p\relax (x) }dx\right ) \)

\(\blacksquare \) In using method of eigenfunction expansion to solve nonhomogeneous B.C. PDE, the eigenfunction used \(\phi _{n}\relax (x) \) are the ones from solving the PDE with homogeneous B.C. For example, if given \(\frac {\partial u}{\partial t}=k\frac {\partial ^{2}u}{\partial x^{2}}\) with nonhomogeneous B.C. , say \(u\left (0,t\right ) =A,u\left (L,t\right ) =B\), and we want to use \(u\left (x,t\right ) \sim \sum _{n=1}^{\infty }b_{n}\left ( t\right ) \phi _{n}\relax (x) \) on the nonhomogeneous B.C. PDE, then those \(\phi _{n}\relax (x) \) are from the corresponding homogeneous B.C. PDE. They should be something like \(\sin \left (\frac {n\pi }{L}x\right ) \) and so on.

See the table at the top. That is why we write \(\sim \) above and not \(=\) and remember not to do term-by-term differentiation in \(x\) when this is the case. We can do term-by-term differentiation only of the PDE itself also has homogeneous B.C. , even if it had a source term there also. The point is, \(\phi _{n}\relax (x) \) always come from solving the homogeneous B.C. PDE. (This normally means solving a Sturm-Liouville ODE).

\(\blacksquare \) I found new relation for eigenvalue \(\lambda \), but it is probably not useful as the one the book has. Here how to derive it
Given S-L \begin {equation} \frac {d}{dx}\left (p\frac {dy}{dx}\right ) +qy=-\lambda \sigma y \tag {1} \end {equation} Let \(\phi _{n}\relax (x) \) be the eigenfunction associated with eigenvalue \(\lambda _{n}\). Since eigenfunctions satisfy the ODE itself, then we can write, for any arbitrary eigenfunction (subscript removed for clarity in what follows)\begin {align*} \frac {d}{dx}\left (p\phi ^{\prime }\right ) +q\phi & =-\lambda \sigma \phi \\ p^{\prime }\phi ^{\prime }+p\phi ^{\prime \prime }+q\phi & =-\lambda \sigma \phi \end {align*}

Integrating both sides\begin {equation} \int _{a}^{b}p^{\prime }\phi ^{\prime }dx+\int _{a}^{b}p\phi ^{\prime \prime }dx+\int _{a}^{b}q\phi dx=-\lambda \int _{a}^{b}\sigma \phi dx \tag {2} \end {equation} Looking at \(\int _{a}^{b}p\phi ^{\prime \prime }dx\). Integrating by part. Let \(u=p,dv=\phi ^{\prime \prime }\rightarrow du=p^{\prime },v=\phi ^{\prime }\), hence\begin {align} \int _{a}^{b}p\phi ^{\prime \prime }dx & =\left [ uv\right ] _{a}^{b}-\int _{a}^{b}vdu\nonumber \\ & =\left [ p\phi ^{\prime }\right ] _{a}^{b}-\int _{a}^{b}p^{\prime }\phi ^{\prime }dx \tag {3} \end {align}

Substituting (3) into (2) gives\begin {align*} \int _{a}^{b}p^{\prime }\phi ^{\prime }dx+\left [ p\phi ^{\prime }\right ] _{0}^{L}-\int _{a}^{b}p^{\prime }\phi ^{\prime }dx+\int _{a}^{b}q\phi dx & =-\lambda \int _{a}^{b}\sigma \phi dx\\ \left [ p\phi ^{\prime }\right ] _{0}^{L}+\int _{a}^{b}q\phi dx & =-\lambda \int _{a}^{b}\sigma \phi dx \end {align*}

Hence\[ \lambda =-\frac {\left [ p\phi ^{\prime }\right ] _{a}^{b}+\int _{a}^{b}q\phi dx}{\int _{a}^{b}\sigma \phi dx}\] compare to the one in the book \(\lambda =\frac {\left (-p\phi \phi ^{\prime }\right ) _{b}^{a}+\int _{a}^{b}p\left (\phi ^{\prime }\right ) ^{2}-q\phi ^{2}dx}{\int _{a}^{b}\phi ^{2}\sigma \relax (x) dx}\).

\(\blacksquare \) This is about using eigenfunction expansion to approximate \(f\relax (x) \). Given the eigenfunctions \(\Phi _{n}\relax (x) \) found by solving SL \(L\left [ y\right ] =\lambda ry\) problem, then these eigenfunctions are called complete in the mean sense, if \(\sum _{n=1}^{\infty }b_{n}\Phi _{n}\relax (x) \) converges to the mean of \(f\relax (x) \) at each point in \(a\leq x\leq b\).

This requires only that \(f\relax (x) \) be square integrable over the domain. We can also say that the eigenfunctions are complete, in pointwise sense, if \(\sum _{n=1}^{\infty }b_{n}\Phi _{n}\relax (x) \) converges to \(f\relax (x) \) at each point in \(a\leq x\leq b\). This however requires that \(f\relax (x) \) be continuous in \(a\leq x\leq b\) and \(f^{\prime }\relax (x) \) be piecewise continuous.

So clearly the first case is less strict than the second. This means if \(f\relax (x) \) is only square integrable (and not necessarily continuos), then convergence to the mean of \(f\relax (x) \) at each point can be obtained. To obtain the more strict point wise convergence, more restrictions are needed on \(f\relax (x) \) as mentioned above. Clearly point wise convergence implies convergence in the mean. But not the other way around.

\(\blacksquare \) For Sturm-Liouville, write it as \(py^{\prime \prime }+p^{\prime }y^{\prime }+\left (q+\lambda \sigma \right ) y=0\). Now it is easy to compare to. For example, \(x^{\prime \prime }+\lambda x=0\), then we see that \(p=1,q=0,\sigma =1\). And for \(xy^{\prime \prime }+x^{\prime }+\lambda xy=0\) then \(p=1,q=0,\sigma =x\). Do not divide by \(x\) first to re-write \(xy^{\prime \prime }+x^{\prime }+\lambda xy=0\) as \(y^{\prime \prime }+\frac {1}{x}x^{\prime }+\lambda y=0\) and then compare, as this makes it hard to compare to standard form and will make error.  Books write Sturm-Liouville as \(\frac {d}{dx}\left ( py^{\prime }\right ) +qy+\lambda \sigma y=0\) but I find this confusing to compare to this way. Better to expand it to \(py^{\prime \prime }+p^{\prime }y^{\prime }+\left (q+\lambda \sigma \right ) y=0\) first so now it is in a more familiar form and can read it out more directly.

5 Methods of solutions to some Sturm Liouville problems

Problem 1 If the problem gives S-L equations, and asks to find estimate on the smallest eigenvalue, then use Rayleigh quotient for \(\lambda \). And write \(\lambda _{\min }=\lambda _{1}\leq \frac {\left (-p\phi \phi ^{\prime }\right ) _{b}^{a}+\int _{a}^{b}p\left (\phi ^{\prime }\right ) ^{2}-q\phi ^{2}dx}{\int _{a}^{b}\phi ^{2}\sigma \relax (x) dx}\). Now we do not need to solve the SL to find solution \(\phi \), this is the whole point. Everything in RHS is given, except for, of course the solution \(\phi \).

Here comes the main idea: Come up with any trial \(\phi _{trial}\) and use it in place of \(\phi \). This trial function just needs to satisfy the boundary conditions, which is also given. Then all what we need to do is just evaluate the integral. Pick the simplest \(\phi _{trial}\) function of \(x\) which satisfies boundary conditions. All other terms \(p,q,\sigma \) we can read from the given problem.

At the end, we should get a numerical value for the integral. This is the upper limit of the lowest eigenvalue \(\lambda _{1}\)

Problem 2 We are given SL problem with boundary conditions, and asked to show that \(\lambda \geq 0\) without solving the ODE to find \(\phi \):

Use Rayleigh quotient and argue that the denominator can’t be zero (else eigenfunction is zero, which is not possible) and it also can’t be negative. Argue about the numerator not being negative. Do not solve the ODE! this is the whole point of using Rayleigh quotient.

Problem We are given an SL problem with boundary conditions and asked to estimate large \(\lambda \) and corresponding eigenfunction.

This is different from being asked to estimate the smallest eigenvalue, where we use Rayleigh quotient and trial function. In this one, we instead use WKB. For 1D, just use what is called the physical optics correction method, given by \(\phi \relax (x) \approx \left (\sigma p\right ) ^{\frac {-1}{4}}\left (c_{1}\cos \left (\sqrt {\lambda }\int _{0}^{x}\sqrt {\frac {\sigma \left ( t\right ) }{p\relax (t) }}dt\right ) +c_{2}\sin \left (\sqrt {\lambda }\int _{0}^{x}\sqrt {\frac {\sigma \relax (t) }{p\relax (t) }}dt\right ) \right ) \). Where \(\sigma ,p\) in this are all known functions (from the problem itself). Notice that \(q\) is not there.

Now use the boundary conditions and solve for one of the constants, should come to be zero. Use the second boundary condition and (remember, the boundary conditions are homogenous), and we get one equation in \(\lambda \) and for non-trivial solution, solve for allowed values of \(\lambda _{n}\). This gives the large eigenvalue estimate. i.e. for \(\lambda _{n}\) when \(n\) is very large. Depending on problem, \(n\) does not have be too large to get good accurate estimate compared with exact solution. See HW7, problem 5.9.2 for example.

Problem We are given 1D heat ODE \(\frac {\partial u}{\partial t}=k\frac {\partial ^{2}u}{\partial x^{2}}\). If B.C. are homogenous, we are done. We know the solution. Separation of variables. If the B.C. is not homogenous, then we have a small problem. We can’t do separation of variables. If you can find equilibrium solution, \(u_{r}\relax (x) \), where this solution only needs to satisfy the non-homogenous B.C. then write solution as \(u\left (x,t\right ) =v\left (x,t\right ) +u_{r}\relax (x) \), and plug this in the PDE.

This will give \(\frac {\partial v}{\partial t}=k\frac {\partial ^{2}v}{\partial x^{2}}\) but this now satisfies the homogenous B.C. This is the case if the original non-homogenous B.C. were Dirichlet. If they were Neumann, then we will get \(\frac {\partial v}{\partial t}=k\frac {\partial ^{2}v}{\partial x^{2}}+extra\) where \(extra\) is extra term that do not vanish because \(u_{r}\left ( x\right ) \) do not vanish in this case after double differentiating. We solve for \(v\left (x,t\right ) \) now, since it has homogenous B.C., but it has extra term there, which we treat as new source. We apply initial conditions now also to final all the eigenfunction expansion coefficients. We are done. Now we know \(u\left (x,t\right ) .\)

But if we can find the reference function \(u_{r}\) or we do not want to use this method, then we can use another method, called eigenfunction expansion. We assume \(u\left (x,t\right ) =\sum _{n=1}^{\infty }a_{n}\relax (t) \phi _{n}\relax (x) \) and plug this in the PDE.  But now we can’t do term by term differentiation, since \(\phi _{n}\) are the eigenfunctions that satisfies the homogenous B.C., while \(u\left (x,t\right ) \) has non-homogenous B.C.

So the trick is to use Green formula, to rewrite \(\int _{0}^{L}k\frac {\partial ^{2}u}{\partial x^{2}}\phi _{n}\relax (x) dx\) as \(\int _{0}^{L}ku\frac {\partial ^{2}\phi _{n}\relax (x) }{\partial x^{2}}dx\) plus the contribution from the boundary terms. (this is like doing integration by parts twice, but much easier). Now rewrite \(\frac {\partial ^{2}\phi _{n}\left ( x\right ) }{\partial x^{2}}=-\lambda _{n}\phi _{n}\relax (x) \). See page 355-356, Haberman.

6 Examples converting second order linear ODE to Sturm-Liouville

Any linear second order ODE can be converted to S-L form. The S-L form is the following\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Some books write the above with a plus sign \(+q\relax (x) y\left ( x\right ) \) there (for example, Haberman) and some with \(-q\relax (x) y\relax (x) \) for example Boyce/DiPrima. I really never understood why some put a minus sign and some do not. May be I’ll find out one day. But for now will use (1) is used as the S-L form. If your book uses the plus sign, the same process works, just flip the sign.

The goal now, is given any general form eigenvalue ODE or second order ODE (with no eigenvalue \(\lambda \)), we want to convert it (rewrite it) in the above form. The second order linear ODE will have this form\begin {equation} a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+\left ( c\relax (x) +\lambda \right ) y=0 \tag {2} \end {equation} The parameter \(\lambda \) in the S-L form, is the eigenvalue. We really only use S-L form for eigenvalue problems, but if the \(\lambda \) is missing, then the form of the input will be \[ a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+c\left ( x\right ) y=0 \] And the process will work the same way.

First, we will show how to convert (2) to (1), and then show few examples. We are given (2), and want to convert it to (1). The first step is to convert (2) to standard form\[ y^{\prime \prime }+\frac {b\relax (x) }{a\relax (x) }y^{\prime }+\frac {c\relax (x) +\lambda }{a\relax (x) }y=0 \] Then multiply the above by some unknown \(p\relax (x) \) which is assumed positive. This is called the integrating factor\[ p\relax (x) y^{\prime \prime }+p\relax (x) \frac {b\left ( x\right ) }{a\relax (x) }y^{\prime }+p\relax (x) \frac {c\left ( x\right ) +\lambda }{a\relax (x) }y=0 \] Now, rewrite \(p\relax (x) y^{\prime \prime }\) as \(\left (p\left ( x\right ) y^{\prime }\right ) ^{\prime }-p^{\prime }\relax (x) y^{\prime }\) in the above\begin {align} \left (\left (p\relax (x) y^{\prime }\right ) ^{\prime }-p^{\prime }\relax (x) y^{\prime }\right ) +p\relax (x) \frac {b\left ( x\right ) }{a\relax (x) }y^{\prime }+p\relax (x) \frac {\left ( c\relax (x) +\lambda \right ) }{a\relax (x) }y & =0\nonumber \\ \left (p\relax (x) y^{\prime }\right ) ^{\prime }+y^{\prime }\left ( p\relax (x) \frac {b\relax (x) }{a\relax (x) }-p^{\prime }\relax (x) \right ) +p\relax (x) \frac {\left (c\left ( x\right ) +\lambda \right ) }{a\relax (x) }y & =0 \tag {3} \end {align}

Here comes the main trick in all of this. We want to force \(p\relax (x) \frac {b\relax (x) }{a\relax (x) }-p^{\prime }\relax (x) \) to be zero. This implies \[ p\relax (x) =e^{\int \frac {b\relax (x) }{a\relax (x) }dx}\] This comes from just solving the ODE \(p\relax (x) \frac {b\left ( x\right ) }{a\relax (x) }-p^{\prime }\relax (x) =0\). Therefore, if \(p\relax (x) =e^{\int \frac {b\relax (x) }{a\relax (x) }dx}\), then (3) becomes\begin {align} \left (p\relax (x) y^{\prime }\right ) ^{\prime }+\left (p\left ( x\right ) \frac {c\relax (x) +\lambda }{a\relax (x) }\right ) y & =0\nonumber \\ \left (p\relax (x) y^{\prime }\right ) ^{\prime }+\left (p\left ( x\right ) \frac {c\relax (x) }{a\relax (x) }+\frac {p\left ( x\right ) }{a\relax (x) }\lambda \right ) y & =0\nonumber \\ \left (p\relax (x) y^{\prime }\right ) ^{\prime }+p\relax (x) \frac {c\relax (x) }{a\relax (x) }y+\frac {p\relax (x) }{a\relax (x) }\lambda y & =0 \tag {4} \end {align}

We are done. Compare (4) to the SL form\[ \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \]  We see that, given \(a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+\left (c\relax (x) +\lambda d\right ) y=0\) then

\begin {align*} p\relax (x) & =e^{\int \frac {b}{a}dx}\\ q\relax (x) & =-p\frac {c}{a}\\ r\relax (x) & =\frac {p}{a}d \end {align*}

Let see how this works on some examples

6.1 Example 1

Convert \(y^{\prime \prime }+5\lambda y=0\) to S-L.  We see the general form here is \(a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+\left ( c\relax (x) +\lambda d\right ) y=0\), with \(a=1,b=0,c=0,d=5\). Hence \(p\relax (x) =e^{\int \frac {b\relax (x) }{a\relax (x) }dx}=1\), therefore the SL form is\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Where \begin {align*} p\relax (x) & =1\\ q\relax (x) & =-p\relax (x) \frac {c\relax (x) }{a\relax (x) }=0\\ r\relax (x) & =\frac {p\relax (x) }{a\relax (x) }d=5 \end {align*}

Hence (1) becomes\[ \left (y^{\prime }\right ) ^{\prime }+5\lambda y=0 \] This was easy, since the ODE given was already in SL form.

6.2 Example 2

Convert Bessel ODE to SL \[ x^{2}y^{\prime \prime }+xy^{\prime }+\left (x^{2}-n^{2}\right ) y=0 \]

Comparing the above to \(a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+\left (c\relax (x) +\lambda \right ) y=0\), we see that \(a=x^{2},b=x,c=x^{2},d=1,\lambda =-n^{2}\). Hence \[ p\relax (x) =e^{\int \frac {b\relax (x) }{a\relax (x) }dx}=e^{\int \frac {x}{x^{2}}dx}=e^{\ln x}=x \] Therefore the SL form is\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Where \begin {align*} p\relax (x) & =x\\ q\relax (x) & =-p\relax (x) \frac {c\relax (x) }{a\relax (x) }=-x\left (\frac {x^{2}}{x^{2}}\right ) =-x\\ r\relax (x) & =\frac {p\relax (x) }{a\relax (x) }d=\frac {1}{x} \end {align*}

Therefore (1) becomes\begin {equation} \left (xy^{\prime }\right ) ^{\prime }+xy-\frac {n^{2}}{x}y\relax (x) =0 \tag {2} \end {equation} Let check if the above gives back \(x^{2}y^{\prime \prime }+xy^{\prime }+\left ( x^{2}-n^{2}\right ) y=0\). Expanding the above gives \[ xy^{\prime \prime }+y^{\prime }+xy-\frac {n^{2}}{x}y=0 \] Multiplying by \(x\)\begin {align*} x^{2}y^{\prime \prime }+xy^{\prime }+x^{2}y-n^{2}y & =0\\ x^{2}y^{\prime \prime }+xy^{\prime }+\left (x^{2}-n^{2}\right ) y & =0 \end {align*}

This verifies that (2) is the Sturm-Liouville form for Bessel ODE

6.3 Example 3

Convert Legendre ODE to SL\[ \left (1-x^{2}\right ) y^{\prime \prime }-2xy^{\prime }+n\left (n+1\right ) y=0 \] The general form here is \(a\relax (x) y^{\prime \prime }+b\left ( x\right ) y^{\prime }+\left (c\relax (x) +\lambda \right ) y=0\), with \(a=\left (1-x^{2}\right ) ,b=-2x,c=0,d=1\) and \(\lambda =n\left (n+1\right ) \). Hence \begin {align*} p\relax (x) & =e^{\int \frac {-2x}{\left (1-x^{2}\right ) }dx}=e^{\ln \left (x^{2}-1\right ) }=x^{2}-1\\ q\relax (x) & =-p\frac {c}{a}=0\\ r\relax (x) & =\frac {p}{a}d=\frac {x^{2}-1}{1-x^{2}}=-1 \end {align*}

Hence SL is\begin {align*} \left (py^{\prime }\right ) ^{\prime }-qy+\lambda ry & =0\\ \left (\left (x^{2}-1\right ) y^{\prime }\right ) ^{\prime }-n\left ( n+1\right ) y & =0 \end {align*}

Let check. Expanding the above gives\[ \left (x^{2}-1\right ) y^{\prime \prime }+2xy^{\prime }-n\left (n+1\right ) y=0 \] Multiplying both sides by \(-1\) gives \[ \left (1-x^{2}\right ) y^{\prime \prime }-2xy^{\prime }+n\left (n+1\right ) y=0 \] Which is the original form of Legendre ODE

6.4 Example 4

Convert \[ y^{\prime \prime }-\frac {2x}{1-x^{2}}y^{\prime }+\frac {\lambda }{1-x^{2}}y=0 \] Rewrite as\[ \left (1-x^{2}\right ) y^{\prime \prime }-2xy^{\prime }+\lambda y=0 \] The general form here is \(a\relax (x) y^{\prime \prime }+b\left ( x\right ) y^{\prime }+\left (c\relax (x) +\lambda \right ) y=0\), with \(a=\left (1-x^{2}\right ) ,b=-2x,c=0,d=1\). Hence \[ \mu \relax (x) =e^{\int \frac {b}{a}dx}=e^{-\int \frac {2x}{1-x^{2}}dx}=e^{-\left (-\ln \left (x-1\right ) -\ln \left (x+1\right ) \right ) }=e^{\ln \left (x-1\right ) }e^{\ln \left (x+1\right ) }=\left (x-1\right ) \left (x+1\right ) =x^{2}-1 \] Therefore the SL form is\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Where \begin {align*} p\relax (x) & =x^{2}-1\\ q\relax (x) & =-p\relax (x) \frac {c\relax (x) }{a\relax (x) }=0\\ r\relax (x) & =\frac {p\relax (x) }{a\relax (x) }d=\frac {x^{2}-1}{x^{2}-1}=1 \end {align*}

Hence (1) becomes\[ \left (\left (x^{2}-1\right ) \frac {dy}{dx}\right ) ^{\prime }+\lambda y\relax (x) =0 \]

6.5 Example 5

Convert \[ \left (1-x^{2}\right ) y^{\prime \prime }-xy^{\prime }+\lambda y=0 \] We see the general form here is \(a\relax (x) y^{\prime \prime }+b\left ( x\right ) y^{\prime }+\left (c\relax (x) +\lambda \right ) y=0\), with \(a=\left (1-x^{2}\right ) ,b=-x,c=0,d=1\). Hence \[ p\relax (x) =e^{\int \frac {b}{a}dx}=e^{-\int \frac {x}{1-x^{2}}dx}\] But \(\int \frac {x}{1-x^{2}}dx=-\frac {1}{2}\ln \left \vert 1-x^{2}\right \vert \), therefore \(\mu \relax (x) =e^{\frac {1}{2}\ln \left \vert 1-x^{2}\right \vert }=\sqrt {1-x^{2}}\). Therefore the SL form is\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Where \begin {align*} p\relax (x) & =\sqrt {1-x^{2}}\\ q\relax (x) & =-p\relax (x) \frac {c\relax (x) }{a\relax (x) }=0\\ r\relax (x) & =\frac {p\relax (x) }{a\relax (x) }d=\frac {\sqrt {1-x^{2}}}{\left (1-x^{2}\right ) }=\frac {1}{\sqrt {1-x^{2}}} \end {align*}

Hence (1) becomes\[ \left (\sqrt {1-x^{2}}\frac {dy}{dx}\right ) ^{\prime }+\frac {\lambda }{\sqrt {1-x^{2}}}y\relax (x) =0 \]

6.6 Example 6

Convert\[ a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+c\left ( x\right ) y=0 \] The general form here is \(a\relax (x) y^{\prime \prime }+b\left ( x\right ) y^{\prime }+\left (c\relax (x) +\lambda \right ) y=0\), with \(a=a\relax (x) ,b=b\relax (x) ,c=c\relax (x) ,d=1\) and \(\lambda =0\). Hence \[ p\relax (x) =e^{\int \frac {b}{a}dx}\] Therefore the SL form is\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Where \begin {align*} p\relax (x) & =e^{\int \frac {b}{a}dx}\\ q\relax (x) & =-p\relax (x) \frac {c\relax (x) }{a\relax (x) }=-e^{\int \frac {b}{a}dx}\frac {c\relax (x) }{a\relax (x) }\\ r\relax (x) & =\frac {p\relax (x) }{a\relax (x) }d=e^{\int \frac {b}{a}dx} \end {align*}

Hence (1) becomes\begin {equation} \left (e^{\int \frac {b}{a}dx}y^{\prime }\right ) ^{\prime }+e^{\int \frac {b}{a}dx}\frac {c\relax (x) }{a\relax (x) }y\relax (x) =0 \tag {2} \end {equation} We see that the above is the same as \(a\relax (x) y^{\prime \prime }+b\relax (x) y^{\prime }+c\relax (x) y=0\) because\[ \left (e^{\int \frac {b}{a}dx}y^{\prime }\right ) ^{\prime }=\frac {b}{a}e^{\int \frac {b}{a}dx}y^{\prime }+e^{\int \frac {b}{a}dx}y^{\prime \prime }\] And hence (2) becomes\[ \frac {b}{a}e^{\int \frac {b}{a}dx}y^{\prime }+e^{\int \frac {b}{a}dx}y^{\prime \prime }+e^{\int \frac {b}{a}dx}\frac {c\relax (x) }{a\left ( x\right ) }y\relax (x) =0 \] Canceling\(\ e^{\int \frac {b}{a}dx}\) gives\begin {align*} \frac {b}{a}y^{\prime }+y^{\prime \prime }+\frac {c\relax (x) }{a\left ( x\right ) }y\relax (x) & =0\\ ay^{\prime \prime }+by^{\prime }+cy & =0 \end {align*}

Which is the original ODE.

6.7 Example 7

Convert\[ 3y^{\prime \prime }+2y^{\prime }+5y=0 \] The general form here is \(a\relax (x) y^{\prime \prime }+b\left ( x\right ) y^{\prime }+\left (c\relax (x) +\lambda \right ) y=0\), with \(a=3,b=2,c=5,d=1\) and \(\lambda =0\). Hence \[ p\relax (x) =e^{\int \frac {b}{a}dx}=e^{\int \frac {2}{3}dx}=e^{\frac {2}{3}x}\] Therefore the SL form is\begin {equation} \left (p\relax (x) \frac {dy}{dx}\right ) ^{\prime }-q\relax (x) y\relax (x) +\lambda r\relax (x) y\relax (x) =0 \tag {1} \end {equation} Where \begin {align*} p\relax (x) & =e^{\frac {2}{3}x}\\ q\relax (x) & =-p\relax (x) \frac {c\relax (x) }{a\relax (x) }=-e^{\frac {2}{3}x}\left (\frac {5}{2}\right ) \\ r\relax (x) & =\frac {p\relax (x) }{a\relax (x) }d=\frac {e^{\frac {2}{3}x}}{3} \end {align*}

Hence (1) becomes\begin {equation} \left (e^{\frac {2}{3}x}y^{\prime }\right ) ^{\prime }+e^{\frac {2}{3}x}\left ( \frac {5}{2}\right ) y\relax (x) =0 \tag {2} \end {equation} The above is the same as \(3y^{\prime \prime }+2y^{\prime }+5y=0\) because\[ \left (e^{\frac {2}{3}x}y^{\prime }\right ) ^{\prime }=\frac {2}{3}e^{\frac {2}{3}x}y^{\prime }+e^{\frac {2}{3}x}y^{\prime \prime }\] And hence (2) becomes\[ \frac {2}{3}e^{\frac {2}{3}x}y^{\prime }+e^{\frac {2}{3}x}y^{\prime \prime }+e^{\frac {2}{3}x}\left (\frac {5}{2}\right ) y\relax (x) =0 \] Canceling\(\ e^{\frac {2}{3}x}\) gives\begin {align*} \frac {2}{3}y^{\prime }+y^{\prime \prime }+\frac {5}{2}y\relax (x) & =0\\ 2y^{\prime \prime }+3y^{\prime }+5y & =0 \end {align*}

Which is the original ODE.