Previously we saw that a number $\lambda$ is an eigenvalue of $L$ if and only if $\lambda$ is a root of the characteristic polynomial of $L$. This statement can be refined by means of the notion of minimal polynomial.
Let $L : V \to V$ be a linear map from a finite-dimensional vector space $V$ to itself. A number $\lambda$ is an eigenvalue of $L$ if and only if it is a root of the minimal polynomial of $L$.
If $V$ is a real vector space, then, according to the Fundamental theorem of algebra, the characteristic polynomial of $L$ will factor into a product of linear polynomials and quadratic polynomials with leading coefficients $1$ and negative discriminants. Each of these quadratic polynomials will also occur as a divisor of the minimal polynomial (not necessarily with the same multiplicity).
If $\lambda$ is a root of the minimal polynomial, then it is also a root of the characteristic polynomial (which, after all, is a multiple of the minimal polynomial), and so, according to rule 1 of the Characterization of eigenvalues and eigenvectors, $\lambda$ is an eigenvalue.
Conversely, if $\lambda$ is an eigenvalue of $L$ with eigenvector $v$, then, for the minimal polynomial $m_L$, we have $$0 = m_L(L)(v) = m_L(\lambda)\cdot v.$$ The second equality holds because $L(v) = \lambda v$ implies $p(L)(v) = p(\lambda)\cdot v$ for every polynomial $p$. Since $v$ is an eigenvector, we have $v \neq 0$, so $m_L(\lambda) = 0$. This shows that $\lambda$ is a root of the minimal polynomial.
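The identity $p(L)(v) = p(\lambda)\cdot v$ used above is easy to check numerically. The sketch below uses a hypothetical $2 \times 2$ matrix with eigenvalue $2$ and minimal polynomial $x^2 - 5x + 6$ (all values chosen only for illustration):

```python
import numpy as np

# Hypothetical example matrix with eigenvalue 2 and eigenvector (1, 0)
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lam = 2.0
v = np.array([1.0, 0.0])

# p(x) = x^2 - 5x + 6 is the minimal polynomial of this particular A
def p_of_matrix(M):
    return M @ M - 5 * M + 6 * np.eye(2)

def p_scalar(t):
    return t**2 - 5*t + 6

assert np.allclose(A @ v, lam * v)                        # v is an eigenvector
assert np.allclose(p_of_matrix(A) @ v, p_scalar(lam) * v)  # p(A)v = p(lam)v
# Since p is the minimal polynomial, p(A) = 0, which forces p(lam) = 0:
assert np.allclose(p_of_matrix(A), np.zeros((2, 2)))
assert p_scalar(lam) == 0
```

This is exactly the argument of the proof: $p(A)v = p(\lambda)v$ together with $p(A) = 0$ and $v \neq 0$ forces $p(\lambda) = 0$.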
It remains to prove the last statement. Suppose that $V$ is a real vector space and that $q(x)$ is a quadratic polynomial with leading coefficient $1$ and negative discriminant which divides the characteristic polynomial of $L$. If $\mu$ is a root of $q$, then $\mu$ is not real, and therefore the complex conjugate $\overline{\mu}$ is the second root of $q$. By applying the above to the complexification $L_{\mathbb{C}}$ of $L$, we see that both $x - \mu$ and $x - \overline{\mu}$ occur as factors of the minimal polynomial of $L_{\mathbb{C}}$ (the minimal polynomial of $L_{\mathbb{C}}$ on the complexification of $V$ equals the minimal polynomial of $L$ on $V$). Because $q(x) = (x - \mu)(x - \overline{\mu})$, the product $q$ is a divisor of the minimal polynomial.
The difference between the minimal polynomial and the characteristic polynomial is that factors occurring more than once in the characteristic polynomial may occur with a lower multiplicity in the minimal polynomial. The following three matrices illustrate this.
All three of the matrices $$A = \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{pmatrix}, \quad B = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{pmatrix}, \quad C = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix}$$ have characteristic polynomial $(x - \lambda)^3$. But

- $A$ has minimal polynomial $x - \lambda$,
- $B$ has minimal polynomial $(x - \lambda)^2$,
- $C$ has minimal polynomial $(x - \lambda)^3$.
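A numerical sketch of these three minimal polynomials, assuming the matrices are the Jordan-type matrices just described ($\lambda = 2$ is an arbitrary illustrative value):

```python
import numpy as np

lam = 2.0  # any value of lambda works; 2 is chosen only for illustration
I = np.eye(3)
A = lam * I                                            # diagonal matrix
B = np.array([[lam, 1, 0], [0, lam, 0], [0, 0, lam]])  # one Jordan block of size 2
C = np.array([[lam, 1, 0], [0, lam, 1], [0, 0, lam]])  # one Jordan block of size 3

N_A, N_B, N_C = A - lam * I, B - lam * I, C - lam * I

# A is annihilated by x - lam itself:
assert np.allclose(N_A, 0)
# B needs the square: (B - lam*I) != 0 but (B - lam*I)^2 == 0
assert not np.allclose(N_B, 0) and np.allclose(N_B @ N_B, 0)
# C needs the cube: (C - lam*I)^2 != 0 but (C - lam*I)^3 == 0
assert not np.allclose(N_C @ N_C, 0) and np.allclose(N_C @ N_C @ N_C, 0)
```

The smallest power of $(x - \lambda)$ annihilating each matrix is exactly the multiplicity in its minimal polynomial.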
These matrices are examples of matrices in Jordan normal form, which we will discuss later.
We use the minimal polynomial for the following characterization of diagonalizability.
Let $V$ be a vector space of finite dimension $n$ and let $L : V \to V$ be a linear map.
- If $V$ is a real vector space, then $L$ is diagonalizable (over the real numbers) if and only if the minimal polynomial of $L$ is a product of linear factors (with leading coefficients equal to $1$) which are all mutually different.
- If $V$ is a real vector space, then $L$ is diagonalizable over the complex numbers if and only if every irreducible factor of the minimal polynomial of $L$ (written with leading coefficient equal to $1$) occurs only once.
- If $V$ is a complex vector space, then $L$ is diagonalizable if and only if the minimal polynomial of $L$ has no double roots.
If the matrix $D$ of $L$ with respect to some basis is diagonal, and $\lambda_1, \ldots, \lambda_k$ are the mutually different numbers on its diagonal, then the minimal polynomial of $D$, and thus of $L$, is equal to the product $$(x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_k).$$
Conversely, if, for mutually different numbers $\lambda_1, \ldots, \lambda_k$, the minimal polynomial of $L$ is equal to $(x - \lambda_1)(x - \lambda_2) \cdots (x - \lambda_k)$, then, because of the above theorem Roots of the minimal polynomial, these numbers are eigenvalues of $L$. Furthermore, with $$p_i(x) = \prod_{j \neq i} \frac{x - \lambda_j}{\lambda_i - \lambda_j},$$ we have $$1 = p_1(x) + p_2(x) + \cdots + p_k(x).$$
This follows from the fact that the right-hand side is a polynomial of degree at most $k - 1$ with the value $1$ at the $k$ distinct points $\lambda_1, \ldots, \lambda_k$ (see Lagrange's theorem).
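This identity is easy to verify symbolically. Below is a sketch with three hypothetical mutually different values $\lambda_1, \lambda_2, \lambda_3 = 1, 2, 4$ (chosen only for illustration):

```python
import sympy as sp
from functools import reduce
from operator import mul

x = sp.symbols('x')
lams = [1, 2, 4]  # hypothetical mutually different eigenvalues

def p(i):
    """Lagrange basis polynomial p_i(x) = prod_{j != i} (x - lam_j)/(lam_i - lam_j)."""
    factors = [(x - lams[j]) / (lams[i] - lams[j])
               for j in range(len(lams)) if j != i]
    return reduce(mul, factors)

# the p_i sum to the constant polynomial 1
total = sp.expand(sum(p(i) for i in range(len(lams))))
assert total == 1

# each p_i takes the value 1 at lam_i and 0 at the other points
assert all(p(i).subs(x, lams[i]) == 1 for i in range(len(lams)))
assert all(p(i).subs(x, lams[j]) == 0
           for i in range(len(lams)) for j in range(len(lams)) if j != i)
```

The last two assertions are the interpolation property behind the Lagrange argument: the sum has degree at most $k-1$ and takes the value $1$ at all $k$ points.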
We deduce that $V$ is the direct sum of the eigenspaces $E_{\lambda_i} = \ker(L - \lambda_i \, I_V)$.
To this end, we first substitute $L$ for $x$ in the formula above, and study the image of an arbitrary vector $v$ under the result: $$v = p_1(L)(v) + p_2(L)(v) + \cdots + p_k(L)(v).$$
This shows that $V$ is the sum of the linear subspaces $p_i(L)(V)$.
We also see that $$(L - \lambda_i \, I_V)\, p_i(L)(v) = \Big(\prod_{j \neq i} (\lambda_i - \lambda_j)\Big)^{-1} \cdot m_L(L)(v) = 0,$$
so $p_i(L)(V)$ is contained in $E_{\lambda_i} = \ker(L - \lambda_i \, I_V)$.
If, for each $i$, a basis of $E_{\lambda_i}$ is chosen, then, according to theorem Independence of eigenvectors for different eigenvalues, the union of these bases is a set of linearly independent vectors. By what we saw above, these vectors span $V$. Therefore, this union is a basis for $V$. This shows that $V$ is the direct sum of the eigenspaces $E_{\lambda_i}$; in particular, $V$ has a basis of eigenvectors of $L$, so $L$ is diagonalizable.
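The operators $p_i(L)$ in this argument act as projections onto the eigenspaces. A numerical sketch for a hypothetical diagonalizable matrix with eigenvalues $1$ and $3$ (chosen only for illustration):

```python
import numpy as np

# Hypothetical symmetric (hence diagonalizable) matrix with eigenvalues 1 and 3
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
I = np.eye(2)

# p_i(A) = prod_{j != i} (A - lam_j*I) / (lam_i - lam_j), for lam_1 = 1, lam_2 = 3
P1 = (A - 3*I) / (1 - 3)
P2 = (A - 1*I) / (3 - 1)

assert np.allclose(P1 + P2, I)          # the p_i(A) sum to the identity
assert np.allclose(P1 @ P1, P1)         # each p_i(A) is a projection
assert np.allclose(P2 @ P2, P2)
assert np.allclose((A - 1*I) @ P1, 0)   # image of P1 lies in ker(A - 1*I)
assert np.allclose((A - 3*I) @ P2, 0)   # image of P2 lies in ker(A - 3*I)
assert np.allclose(A, 1*P1 + 3*P2)      # A decomposes along its eigenspaces
```

Every vector $v$ is thus written as $P_1 v + P_2 v$ with $P_i v$ in the $i$-th eigenspace, which is the decomposition used in the proof.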
This proves the first and the last statement. The second statement follows by applying this result to the complexification of $L$.
The matrix $$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$$ is not diagonalizable. For, otherwise there would be numbers $a$ and $b$ such that $A$ is conjugate to $\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$. But then, comparing traces and determinants, $$a + b = 2 \quad\text{and}\quad a \cdot b = 1.$$
If we substitute $b = 2 - a$ from the first equation into the second equation, we find the quadratic equation $a^2 - 2a + 1 = 0$, which has only one solution: $a = 1$. This implies $b = 1$, so $\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix} = I$, the identity matrix. That means that there is an invertible $2 \times 2$-matrix $T$ with $A = T\, I\, T^{-1} = I$. This contradicts the fact that the $(1, 2)$-entry of $A$ is equal to $1$.
In accordance with the theorem, the minimal polynomial of $A$, which is $(x - 1)^2$, has a double root.
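A quick numerical check, assuming the matrix in question is $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$:

```python
import numpy as np

# The classic non-diagonalizable matrix (assumed to be the example above)
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
I = np.eye(2)

# Minimal polynomial is (x - 1)^2: A - I is nonzero but its square vanishes
assert not np.allclose(A - I, 0)
assert np.allclose((A - I) @ (A - I), 0)

# Trace 2 and determinant 1 match diag(a, b) only for a = b = 1,
# i.e. the identity matrix -- but A is not the identity.
assert np.trace(A) == 2.0
assert np.isclose(np.linalg.det(A), 1.0)
```

The double root of the minimal polynomial is exactly what the theorem says must prevent diagonalizability.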
- An orthogonal projection $P$ onto a subspace of an inner product space satisfies the equation $P^2 = P$. Its minimal polynomial therefore divides $x^2 - x = x(x - 1)$, which has no double roots, and so $P$ is diagonalizable.
- A reflection $S$ about a linear subspace of dimension $n - 1$ in an $n$-dimensional inner product space satisfies the equation $S^2 = I$. Its minimal polynomial therefore divides $x^2 - 1 = (x - 1)(x + 1)$, which has no double roots, and so $S$ is diagonalizable.
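Both equations are easy to verify for concrete matrices. A sketch with a hypothetical orthogonal projection onto a line in $\mathbb{R}^2$ and the corresponding reflection:

```python
import numpy as np

# Orthogonal projection onto the line spanned by u = (1, 1) in R^2
u = np.array([[1.0], [1.0]])
P = (u @ u.T) / (u.T @ u)      # P = u u^T / (u^T u)
I = np.eye(2)

assert np.allclose(P @ P, P)   # P^2 = P, so x^2 - x annihilates P

# Reflection about the same line: S = 2P - I
S = 2 * P - I
assert np.allclose(S @ S, I)   # S^2 = I, so x^2 - 1 annihilates S
```

Since $x(x-1)$ and $(x-1)(x+1)$ have only simple roots, both matrices are diagonalizable by the theorem.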
A well-known criterion for the absence of multiple factors in the factorization of a polynomial $p$ is $$\gcd(p, p') = 1.$$ Here $\gcd$ stands for "greatest common divisor" and $p'$ for the derivative of $p$. With the aid of the Euclidean algorithm for polynomials this greatest common divisor can be found in an efficient manner. This leads to the following method for determining the diagonalizability of a square matrix $A$:
- Determine the minimal polynomial $p$ of $A$.
- Calculate $\gcd(p, p')$.
- If this equals $1$, then $A$ is diagonalizable over the complex numbers. If so and if $A$ is real, then $A$ is diagonalizable over the real numbers if and only if all of the roots of $p$ are real.
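The gcd test can be sketched with a computer algebra system; the two polynomials below are hypothetical examples, one square-free and one with a repeated factor:

```python
import sympy as sp

x = sp.symbols('x')

# Hypothetical minimal polynomials for illustration
p_diag = (x - 1) * (x - 2)       # distinct linear factors: diagonalizable
p_not  = (x - 1)**2 * (x - 2)    # repeated factor: not diagonalizable

g1 = sp.gcd(sp.expand(p_diag), sp.diff(p_diag, x))
g2 = sp.gcd(sp.expand(p_not), sp.diff(p_not, x))

assert g1 == 1                         # gcd(p, p') = 1: no repeated factors
assert sp.expand(g2 - (x - 1)) == 0    # the repeated factor appears in the gcd
```

In the second case the gcd is exactly the repeated factor $x - 1$, which is what the Euclidean algorithm detects.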
Let $V$ be a real vector space of dimension $3$ and suppose that $L : V \to V$ is a linear map with minimal polynomial $$x^3 - 2x^2 - x + 2.$$
Is $L$ diagonalizable over the real numbers?
Yes
To see this, we determine the eigenvalues of $L$. Trial and error yields that $x = 1$ is a root of the minimal polynomial. Division by $x - 1$ gives the factorization $$x^3 - 2x^2 - x + 2 = (x - 1)(x^2 - x - 2).$$
The quadratic factor factors as $x^2 - x - 2 = (x - 2)(x + 1)$, so the minimal polynomial has three distinct real roots. According to the first item of theorem Recognizing diagonalizability using the minimal polynomial, the answer is therefore: Yes.
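The steps of this example can be checked symbolically (assuming the minimal polynomial is $x^3 - 2x^2 - x + 2$):

```python
import sympy as sp

x = sp.symbols('x')
p = x**3 - 2*x**2 - x + 2   # the assumed minimal polynomial of L

# x = 1 is a root, and dividing by (x - 1) leaves a quadratic
assert p.subs(x, 1) == 0
q = sp.quo(p, x - 1)                       # polynomial quotient
assert sp.expand(q - (x**2 - x - 2)) == 0

# the full factorization has three distinct real linear factors
assert sp.expand((x - 2) * (x - 1) * (x + 1) - p) == 0
assert sp.gcd(p, sp.diff(p, x)) == 1       # square-free: diagonalizable over C
roots = sp.solve(p, x)
assert all(r.is_real for r in roots) and len(set(roots)) == 3
```

Because the gcd test succeeds and all three roots are real, the criterion above confirms diagonalizability over the real numbers.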