We have seen that a linear map #L# of a finite-dimensional real or complex vector space #V# to itself is not always diagonalizable, even if #V# is complex. The problem is that the dimension of the eigenspace of #L# with respect to a root of the characteristic polynomial can be smaller than the multiplicity of that root in #p_L(x)#. Consider for example #V = \mathbb{R}^2# and #L=L_A# for #A=\matrix{0&1\\ 0&0}# with characteristic polynomial #p_L(x) = x^2#. The root #0# has multiplicity #2#, but the dimension of the eigenspace of #L# at eigenvalue #0# is #1#.
We will treat this problem in two steps. First we point out an invariant subspace of #V# that can be bigger than the eigenspace at a given root #\lambda# of the characteristic polynomial, but on which the characteristic polynomial of the restriction of #L# is a power of #x-\lambda#. Later we will indicate a basis for that subspace with respect to which the restriction of #L# approximates a diagonal form.
Let #V# be a vector space, #L:V\to V# a linear map, and #\lambda# an eigenvalue of #L#. The generalized eigenspace of #L# with respect to #\lambda# is the subspace \(E_\lambda^*\) consisting of all vectors #\vec{v}# of #V# for which there exists a natural number #k# that satisfies #(L-\lambda\, I_V)^k(\vec{v}) = \vec{0}#. This subspace is invariant under #L#.
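The defining condition can be tested directly. The following sketch (in NumPy; the matrix, the helper's name, and the bound #k\le\dim{V}# are assumptions made for illustration, the bound being justified by the theorem later in this section) checks membership in a generalized eigenspace:

```python
import numpy as np

# Sketch: test whether v lies in the generalized eigenspace of L at lam by
# checking (L - lam*I)^k v = 0 for k = 1, ..., dim V.  (By the theorem later
# in this section, exponents beyond dim V never help.)
def in_generalized_eigenspace(L, lam, v, tol=1e-10):
    n = L.shape[0]
    M = L - lam * np.eye(n)
    w = v.astype(float)
    for _ in range(n):
        w = M @ w
        if np.linalg.norm(w) < tol:
            return True
    return False

# Assumed example: a 3x3 Jordan block at lam = 2.
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
e3 = np.array([0.0, 0.0, 1.0])
print(in_generalized_eigenspace(J, 2.0, e3))  # True: (J - 2I)^3 e3 = 0
print(in_generalized_eigenspace(J, 3.0, e3))  # False: 3 is not an eigenvalue
```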
To prove that the generalized eigenspace is indeed a linear subspace of #V#, we first notice that #\vec{0}# belongs to \(E_\lambda^*\). Next, if #\vec{v}# lies in \(E_\lambda^*\), then according to the definition there is a natural number #k# that satisfies #(L-\lambda\, I_V)^k(\vec{v}) = \vec{0}#. If #\alpha# is a scalar, then we have \[ (L-\lambda\, I_V)^k(\alpha \vec{v}) = \alpha (L-\lambda\, I_V)^k(\vec{v}) =\alpha\,\vec{0} = \vec{0}\] showing that \(\alpha\, \vec{v}\) belongs to \(E_\lambda^*\). Finally, let #\vec{w}# also be a vector in \(E_\lambda^*\). Then there must be a natural number #\ell# satisfying #(L-\lambda\, I_V)^\ell(\vec{w}) = \vec{0}#. We now have for #m=\max(k,\ell)# \[\begin{array}{rcl}(L-\lambda\, I_V)^m(\vec{v}+\vec{w})& =& (L-\lambda\, I_V)^m(\vec{v})+(L-\lambda\, I_V)^m(\vec{w})\\&&\phantom{xxx}\color{blue}{(L-\lambda\, I_V)^m\text{ is a linear map}}\\& =& (L-\lambda\, I_V)^{m-k}(L-\lambda\, I_V)^{k}(\vec{v})+(L-\lambda\, I_V)^{m-\ell}(L-\lambda\, I_V)^{\ell}(\vec{w})\\&&\phantom{xxx}\color{blue}{\text{composition of linear maps rewritten}}\\ & =& (L-\lambda\, I_V)^{m-k}(\vec{0})+(L-\lambda\, I_V)^{m-\ell}(\vec{0})\\&&\phantom{xxx}\color{blue}{\text{ choice of }k,\, \ell }\\ &=&\vec{0}+\vec{0} \\&&\phantom{xxx}\color{blue}{(L-\lambda\, I_V)^n\text{ is a linear map for all }n\in\mathbb{N}\text{ including }n=0}\\ &=& \vec{0}\end{array}\]from which it follows that \(\vec{v}+\vec{w}\) belongs to \(E_\lambda^*\).
With this we have proven that \(E_\lambda^*\) is a linear subspace of #V#.
To prove the invariance of \(E_\lambda^*\) under #L#, we let #\vec{v}# be an arbitrary vector in \(E_\lambda^*\), and prove that #L(\vec{v})# also belongs to \(E_\lambda^*\). From the definition of \(E_\lambda^*\) it follows that there exists a natural number #k# such that \((L - \lambda\, I_V)^k(\vec{v}) = \vec{0}\). Hence, #\vec{v}# belongs to #\ker{(L - \lambda\, I_V)^k}#. Because #M=(L - \lambda\, I_V)^k# commutes with #L#, we can apply the theorem Invariance of kernel and image under commuting linear maps to see that #\ker{M}# is invariant under #L#, showing that #L(\vec{v})# belongs to \(E_\lambda^*\), which is what we needed to prove.
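The invariance can also be observed numerically. A small sketch (the Jordan-type matrix is an assumption chosen for illustration):

```python
import numpy as np

# Sketch: M = (L - lam*I)^k commutes with L, so L maps ker M into ker M.
# Assumed example: a single 3x3 Jordan block at lam = 2, with k = 2.
L = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
M = np.linalg.matrix_power(L - 2 * np.eye(3), 2)

v = np.array([0.0, 1.0, 0.0])  # v lies in ker M
print(M @ v)                   # the zero vector
print(M @ (L @ v))             # also the zero vector: L(v) stays in ker M
```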
If #V = \mathbb{R}^2# and #L=L_A# for #A=\matrix{0&1\\ 0&0}#, then #0# is the only root of the characteristic polynomial #p_L(x) = x^2#. The eigenspace #E_0# of #L# for eigenvalue #0# is spanned by the standard basis vector #\rv{1,0}# and is therefore a proper subspace of #V#, but the generalized eigenspace #E_0^*# of #L# at #0# coincides with #V#.
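A quick numerical check of this example (a sketch using NumPy):

```python
import numpy as np

# The matrix from the example, with characteristic polynomial x^2.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# The eigenspace E_0 = ker(A) is 1-dimensional ...
dim_E0 = 2 - np.linalg.matrix_rank(A)

# ... but A^2 is the zero matrix, so ker(A^2) is all of R^2:
# the generalized eigenspace E_0^* coincides with V.
dim_E0_star = 2 - np.linalg.matrix_rank(A @ A)

print(dim_E0, dim_E0_star)  # 1 2
```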
In the finite-dimensional case, the dimension of the generalized eigenspace of #\lambda# is equal to the multiplicity of #\lambda# in the characteristic polynomial of #L#:
Assume that #V# is a finite-dimensional vector space and that #L:V\to V# is a linear map such that #\lambda# is a root of the characteristic polynomial #p_L(x)#.
- If #\lambda# has multiplicity #\ell# in the minimal polynomial #m_L(x)#, then \(E_\lambda^* =\ker{(L-\lambda\, I_V)^{\ell}}\).
- If #\lambda# has multiplicity #k# in #p_L(x)#, then \(E_\lambda^* =\ker{(L-\lambda\, I_V)^k}\) and we have \(\dim{E_\lambda^*}= k\).
- Write #e_i=\dim{\ker{(L-\lambda\,I_V)^i}}#, so that #e_\ell=k#. The sequence #e_1,e_2,e_3,\ldots,e_{\ell}# is strictly increasing.
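These three statements can be illustrated numerically. A sketch (the matrix, built from Jordan blocks of sizes #3# and #1#, is an assumption chosen for illustration):

```python
import numpy as np

# Sketch: the numbers e_i = dim ker (L - lam*I)^i for i = 1, ..., up_to.
def kernel_dims(L, lam, up_to):
    n = L.shape[0]
    M = L - lam * np.eye(n)
    P = np.eye(n)
    dims = []
    for _ in range(up_to):
        P = P @ M
        dims.append(int(n - np.linalg.matrix_rank(P)))
    return dims

# Assumed example: Jordan blocks of sizes 3 and 1 at lam = 5, so the
# multiplicity of 5 in p_L is k = 4 and in m_L it is ell = 3.
L = np.array([[5.0, 1.0, 0.0, 0.0],
              [0.0, 5.0, 1.0, 0.0],
              [0.0, 0.0, 5.0, 0.0],
              [0.0, 0.0, 0.0, 5.0]])
print(kernel_dims(L, 5.0, 4))  # [2, 3, 4, 4]: strictly increasing up to e_3 = k
```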
Let #\ell# be the multiplicity of #\lambda# in the minimal polynomial #m_L(x)#. From the definition of the generalized eigenspace it is clear that \(\ker{(L-\lambda\, I_V)^{\ell}}\) is contained in \(E_\lambda^* \). Conversely, suppose that #\vec{w}# is a vector in \({E_\lambda^*}\). Then there exists a natural number #m# such that #\vec{w}# belongs to \(\ker{(L-\lambda\, I_V)^{m}}\). We will show that we can choose #m\le\ell#. Assume #m\gt\ell#. Because #\ell# is the multiplicity of #\lambda# in #m_L(x)#, we can factor the minimal polynomial as
\[m_L(x) = (x-\lambda)^{\ell}\cdot c(x)\]where #c(x)# is a polynomial with #c(\lambda)\ne0#. In particular we have
\[\gcd\left((x-\lambda)^m,m_L(x)\right) = (x-\lambda)^{\ell}\]The extended Euclidean algorithm gives polynomials #a(x)# and #b(x)# that satisfy
\[a(x)\cdot (x-\lambda)^m+b(x)\cdot m_L(x) = (x-\lambda)^{\ell}\]If we substitute #L# in all polynomials of this equality, we find, thanks to the fact that #m_L(L)=0#,
\[a(L)\, (L-\lambda\,I_V)^m = (L-\lambda\,I_V)^{\ell}\]
so that from \(\vec{w}\in \ker{(L-\lambda\, I_V)^m}\) it follows that \[(L-\lambda\,I_V)^{\ell}(\vec{w}) = a(L)\left( (L-\lambda\,I_V)^m(\vec{w}) \right) = a(L)(\vec{0}) = \vec{0}\] showing that #\vec{w}# belongs to \(\ker{(L-\lambda\, I_V)^{\ell}}\). Hence \(\ker{(L-\lambda\, I_V)^{\ell}}\) coincides with \({E_\lambda^*}\). This proves the first statement.
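The extended Euclidean step can be carried out concretely with a computer algebra system. A sketch using SymPy's `gcdex` (the specific polynomials, with #\lambda=0#, #\ell=2#, and #m=4#, are assumptions chosen for illustration):

```python
from sympy import symbols, gcdex, expand

x = symbols('x')

# Assumed concrete instance: lambda = 0, m_L(x) = x^2 * (x - 1), so ell = 2;
# take m = 4 > ell.
f = x**4            # (x - lambda)^m
g = x**2 * (x - 1)  # m_L(x) = (x - lambda)^ell * c(x) with c(0) = -1 != 0

# Extended Euclidean algorithm: a*f + b*g = gcd(f, g) = (x - lambda)^ell.
a, b, h = gcdex(f, g, x)
print(h)                      # x**2, i.e. (x - lambda)^ell
print(expand(a*f + b*g - h))  # 0: the Bezout identity holds
```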
Since #k\ge\ell#, the equality \(E_\lambda^* =\ker{(L-\lambda\, I_V)^k}\) in the second statement follows directly from the first. It remains to show that \(\dim{E_\lambda^*}= k\). Reasoning for #k# and #p_L(x)# along the same lines as above for #\ell# and #m_L(x)#, we find a polynomial #d(x)# with #d(\lambda)\ne0# such that \[p_L(x) = (x-\lambda)^k\cdot d(x)\] According to Invariant direct sum we have the following direct sum decomposition of #V# into subspaces that are invariant under #L#:\[V = \ker{(L-\lambda\,I_V)^k}\oplus \ker{d(L)}\]We have already seen that the first summand is equal to \(E_\lambda^*\). According to the theorem determinants of some special matrices, the characteristic polynomial #p_L(x)# is the product of the characteristic polynomials of #L# restricted to each summand. The characteristic polynomial of #L# restricted to \(E_\lambda^*\) is of the form #(x-\lambda)^t#, where #t=\dim{E_\lambda^*}#, because the corresponding minimal polynomial is a divisor of #(x-\lambda)^k#. On the other hand, #x-\lambda# is not a divisor of the characteristic polynomial of #L# restricted to #\ker{d(L)}#, since the corresponding minimal polynomial divides #d(x)# and #d(\lambda)\ne0#. We conclude that #t=k#, so that #\dim{E_\lambda^*}=k#.
Concerning the third statement: it is obvious that #e_i\le e_{i+1}#, since #\ker{(L-\lambda\,I_V)^i}# is contained in #\ker{(L-\lambda\,I_V)^{i+1}}#. To prove that #e_i\lt e_{i+1}# whenever #i\lt\ell#, we assume that #e_i= e_{i+1}#. We claim that it then follows that #e_i=e_{i+m}# for each natural number #m#. For #m=1# this is exactly the assumption. We use complete induction to prove the claim for all #m#. To this end, let #m\gt 1# and assume that #e_i=e_{i+m-1}# (this is the induction hypothesis). If #\vec{v}# lies in #\ker{(L-\lambda\,I_V)^{i+m}}#, then \[(L-\lambda\,I_V)^{i+m-1}\left((L-\lambda\,I_V)\vec{v}\right)=(L-\lambda\,I_V)^{i+m}(\vec{v})=\vec{0}\] so that #\left(L-\lambda\,I_V\right)\vec{v}# lies in #\ker{(L-\lambda\,I_V)^{i+m-1}}#. Because #e_i=e_{i+m-1}# and #\ker{(L-\lambda\,I_V)^i}# is contained in #\ker{(L-\lambda\,I_V)^{i+m-1}}#, we have #\ker{(L-\lambda\,I_V)^i}=\ker{(L-\lambda\,I_V)^{i+m-1}}#. This means that #\left(L-\lambda\,I_V\right)\vec{v}# lies in #\ker{\left(L-\lambda\,I_V\right)^i}#, so that #\vec{v}# lies in #\ker{\left(L-\lambda\,I_V\right)^{i+1}}#. But #\ker{\left(L-\lambda\,I_V\right)^{i+1}}=\ker{\left(L-\lambda\,I_V\right)^i}# by assumption, so #\vec{v}# in fact lies in #\ker{\left(L-\lambda\,I_V\right)^i}#. Thus, we have deduced that #\ker{(L-\lambda\,I_V)^{i+m}}=\ker{\left(L-\lambda\,I_V\right)^i}#, and hence #e_{i+m}=e_i#.
From the claim we have just proven it follows that, if #e_i=e_{i+1}#, then also #e_m=e_i# for all #m\gt i#, so in particular #e_i=e_\ell=\dim{E^*_\lambda}#. Since #\ell# is the smallest natural number that satisfies #e_\ell=\dim{E^*_\lambda}#, we see that #e_i=e_{i+1}# can only hold for #i\geq\ell#, so that #e_1\lt e_2\lt \cdots\lt e_{\ell-1}\lt e_\ell#. This concludes the proof of the theorem.
The strictly increasing sequence #e_1,e_2,e_3,\ldots,e_{\ell}# shows that
\[\ker{L-\lambda\,I_V}\subset\ker{\left(L-\lambda\,I_V\right)^{2} }\subset\cdots\subset\ker{\left(L-\lambda\,I_V\right)^{\ell} }\]
is a sequence of strictly increasing invariant subspaces. The first subspace is the eigenspace of #L# with respect to #\lambda#; the last subspace is the generalized eigenspace #E_\lambda^*#. In particular, #e_1\ge1# and #e_{\ell}# is the exponent of #x-\lambda# in the characteristic polynomial of #L#. The strict inclusions suggest a method for calculating the exponent #\ell# of #x-\lambda# in the minimal polynomial: consecutively calculate the numbers #e_i# for #i=1,2,\ldots# until #e_i=e_{i+1}#. We then have #i=\ell#.
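This method translates directly into a computation. A sketch (the helper's name and the test matrix, built from Jordan blocks of sizes #3# and #1# so that #\ell=3#, are assumptions for illustration):

```python
import numpy as np

# Sketch of the method just described: compute e_1, e_2, ... until the value
# repeats; the index where growth stops is ell.
def exponent_in_minimal_polynomial(A, lam):
    n = A.shape[0]
    M = A - lam * np.eye(n)
    P = np.eye(n)
    prev = 0
    for i in range(1, n + 1):
        P = P @ M
        e_i = int(n - np.linalg.matrix_rank(P))
        if e_i == prev:
            return i - 1
        prev = e_i
    return n

# Assumed example: Jordan blocks of sizes 3 and 1 at lam = 7, so ell = 3.
A = np.array([[7.0, 1.0, 0.0, 0.0],
              [0.0, 7.0, 1.0, 0.0],
              [0.0, 0.0, 7.0, 0.0],
              [0.0, 0.0, 0.0, 7.0]])
print(exponent_in_minimal_polynomial(A, 7.0))  # 3
```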
The numbers #e_i=\dim{\ker{(L-\lambda\,I_V)^i}}# do not depend on a basis for #V#. In other words, for conjugate #(n\times n)#-matrices #A# and #B# the numbers #\dim{\ker{(B-\lambda\cdot I_n)^i}}# are equal to #\dim{\ker{(A-\lambda\cdot I_n)^i}}#. Later we will see that this information uniquely determines the conjugacy class of #A#; meaning: if for an #(n\times n)#-matrix #B# the values #\dim{\ker{(B-\lambda_i\cdot I_n)^{j}}}# are equal to #\dim{\ker{(A-\lambda_i\cdot I_n)^{j}}}# for all eigenvalues #\lambda_i# and all exponents #j# up to the multiplicity #k_i# of #\lambda_i#, then #A# and #B# are conjugate.
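A numerical sanity check of this basis independence (the matrices #A# and #S# are assumptions chosen for illustration; #S# has an integer inverse, so the computation stays exact):

```python
import numpy as np

# Sanity check: conjugate matrices share all numbers dim ker (M - lam*I)^i.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
S_inv = np.array([[1.0, -1.0, 1.0],
                  [0.0, 1.0, -1.0],
                  [0.0, 0.0, 1.0]])  # indeed S @ S_inv = I
B = S @ A @ S_inv                    # B is conjugate to A

def e(M, lam, i):
    n = M.shape[0]
    P = np.linalg.matrix_power(M - lam * np.eye(n), i)
    return int(n - np.linalg.matrix_rank(P))

print([e(A, 2.0, i) for i in (1, 2, 3)])  # [2, 3, 3]
print([e(B, 2.0, i) for i in (1, 2, 3)])  # [2, 3, 3]: the same numbers
```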
The multiplicity of an eigenvalue #\lambda# as a zero of the characteristic polynomial, that is, the above number #k#, is called the algebraic multiplicity of #\lambda# in #L#. The dimension of #E_\lambda# is often called the geometric multiplicity of #\lambda# in #L#.
Consider the matrix \[ {A} = {\matrix{3 & 0 & 1 & 0 \\ -1 & 4 & 0 & 0 \\ 1 & 0 & 4 & 1 \\ 0 & 0 & -1 & 3 }} \] The characteristic polynomial of #A# is equal to #{\left(x-4\right)^2\cdot \left(x-3\right)^2}#. Therefore, the eigenvalues of #A# are #3# and #4#.
Determine a basis for the generalized eigenspace of #A# corresponding to the eigenvalue #3#.
#\text{basis for }E_{3}^* =# #{\basis{ \matrix{0 \\ 1 \\ 1 \\ -1 \\ } , \matrix{1 \\ 1 \\ 0 \\ -1 \\ } } }#
Since the multiplicity of # 3 # in the characteristic polynomial is equal to #2#, the generalized eigenspace #E_{3}^*# coincides with #\ker{(A-3 \,I_4)^{2}}#.
The obvious method of determining #\ker{(A-3 \,I_4)^{2}}# is as follows: By squaring we find
\[ (A-3 \,I_4)^{2} = \matrix{1 & 0 & 1 & 1 \\ -1 & 1 & -1 & 0 \\ 1 & 0 & 1 & 1 \\ -1 & 0 & -1 & -1 \\ }\] Next, we compute the kernel of this linear map by solving the system of equations \((A-3 \,I_4)^{2} (\vec{x}) = \vec{0}\). This leads to the following basis for \(\ker{(A-3 \,I_4)^{2}}\): \[\basis{ \matrix{0 \\ 1 \\ 1 \\ -1 \\ } , \matrix{1 \\ 1 \\ 0 \\ -1 \\ } } \] Alternatively, we may use theorem
Invariant direct sum, according to which the requested subspace #E_{3}^*# coincides with #\im{(A-4 \,I_4)^2 }#. This subspace is spanned by the columns of the matrix \[(A-4 \,I_4)^2 = \matrix{2 & 0 & -1 & 1 \\ 1 & 0 & -1 & 0 \\ -1 & 0 & 0 & -1 \\ -1 & 0 & 1 & 0 \\ }\] By thinning we find that the following columns form a basis for \(\ker{(A-3 \,I_4)^{2}}\): \[\basis{\cv{ 2 \\ 1 \\ -1 \\ -1 } , \cv{ -1 \\ -1 \\ 0 \\ 1 } } \]
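Both routes can be verified numerically. A sketch with NumPy:

```python
import numpy as np

A = np.array([[ 3.0, 0.0,  1.0, 0.0],
              [-1.0, 4.0,  0.0, 0.0],
              [ 1.0, 0.0,  4.0, 1.0],
              [ 0.0, 0.0, -1.0, 3.0]])
K = np.linalg.matrix_power(A - 3 * np.eye(4), 2)
N = np.linalg.matrix_power(A - 4 * np.eye(4), 2)

# First route: the two vectors found above lie in ker (A - 3 I)^2, which is
# 2-dimensional, so they form a basis.
v1 = np.array([0.0, 1.0, 1.0, -1.0])
v2 = np.array([1.0, 1.0, 0.0, -1.0])
print(np.allclose(K @ v1, 0), np.allclose(K @ v2, 0))  # True True
print(4 - np.linalg.matrix_rank(K))                    # 2

# Second route: every column of (A - 4 I)^2 is killed by (A - 3 I)^2, and
# im (A - 4 I)^2 is also 2-dimensional, so im (A - 4 I)^2 = ker (A - 3 I)^2.
print(np.allclose(K @ N, 0))     # True
print(np.linalg.matrix_rank(N))  # 2
```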