Minimal polynomial

Let #n# be a natural number and #A# an #(n\times n)#-matrix. If #p# is a polynomial with #p(A) = 0# (the zero matrix), then we also say that #A# is a zero of #p#. The characteristic polynomial #p_A# of an #(n\times n)#-matrix satisfies #p_A(A) = 0# and has degree #n#. But sometimes there are polynomials of lower degree of which #A# is a zero (that is: yielding the zero matrix when you substitute #A#). We recall that a polynomial is called monic if its leading coefficient equals #1#.
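As a small illustration (the matrix below is chosen here for convenience and does not come from the text above), consider a scalar matrix: \[A=\matrix{2&0\\ 0&2},\qquad p_A(x)=\det\left(A-x\,I_2\right)=(2-x)^2,\qquad A-2\,I_2=\matrix{0&0\\ 0&0}\] Thus #A# is a zero of the polynomial #x-2# of degree #1#, although #p_A# has degree #2#.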

Minimal polynomial. Let #n# be a natural number and #A# an #(n\times n)#-matrix.

1. There is a unique monic polynomial #m_A(x)# of minimal degree such that #m_A(A) = 0#. This polynomial is called the minimal polynomial of #A#.
2. The minimal polynomial of a matrix #A# is equal to the minimal polynomial of each conjugate of #A#. In particular, we can speak of the minimal polynomial of a linear map #L:V\to V#, where #V# is an #n#-dimensional vector space, which is the minimal polynomial of the #(n\times n)#-matrix #L_\alpha# with respect to an arbitrarily chosen basis #\alpha# for #V#. We then also write #m_L# rather than #m_A#.
3. Each polynomial #f(x)# that satisfies #f(A) = 0# is a multiple of #m_A(x)#. In particular, #m_A# divides #p_A#.
4. Each root of the characteristic polynomial of #A# is a root of the minimal polynomial of #A#.

1. There exists a polynomial #p(x)# with the property #p(A) = 0#, namely the characteristic polynomial as we know thanks to the Cayley-Hamilton theorem. Then there also is such a polynomial of minimal degree. Dividing it by the leading coefficient gives a monic polynomial #m# of minimal degree satisfying #m(A) = 0#.

Suppose that #m_1(x)# and #m_2(x)# are both monic and of minimal degree such that #m_1(A) = m_2(A) = 0#. Because both are monic of the same degree, their leading terms cancel, so #m_1-m_2# is a polynomial of lower degree, and it also satisfies #(m_1-m_2)(A) = m_1(A)-m_2(A) = 0#. If this difference were non-zero, then dividing it by its leading coefficient, say #c#, would give a monic polynomial \[\frac{1}{c}(m_1(x)-m_2(x))\] of lower degree having #A# as a zero, which contradicts the assumption of minimal degree. Therefore #m_1(x)-m_2(x) = 0#, that is, #m_1(x) = m_2(x)#. This establishes the uniqueness of #m#.

2. Let #T# be an invertible #(n\times n)#-matrix and #f(x)# a polynomial. Then #f(A) = 0# holds if and only if #f(T\,A\, T^{-1}) = 0#. Indeed, for a certain degree #k# the polynomial #f(x)# has the form \[f(x)=c_kx^k+c_{k-1}x^{k-1}+\cdots+c_1x+c_0\] so \[\begin{array}{rcl}f(T\,A\, T^{-1})&=&c_k(T\,A\, T^{-1})^k+c_{k-1}(T\,A\, T^{-1})^{k-1}+\cdots+c_1T\,A\, T^{-1}+c_0I_n\\&=&c_k\underbrace{T\,A\, T^{-1}T\,A\, T^{-1}\cdots T\,A\, T^{-1}}_{k\text{ times }T\,A\,T^{-1}}+{}\\&&{}+c_{k-1}\underbrace{T\,A\, T^{-1}T\,A\, T^{-1}\cdots T\,A\, T^{-1}}_{k-1\text{ times }T\,A\,T^{-1}}+\cdots+c_1T\,A\, T^{-1}+c_0I_n\\&=&c_kT\underbrace{A\,A\,\cdots A}_{k\text{ times }A}T^{-1}+c_{k-1}T\underbrace{A\,A\,\cdots A}_{k-1\text{ times }A}T^{-1}+\cdots+c_1T\,A\, T^{-1}+c_0T\,I_n\,T^{-1}\\&=&c_kTA^kT^{-1}+c_{k-1}TA^{k-1}T^{-1}+\cdots+c_1T\,A\, T^{-1}+c_0T\,I_n\,T^{-1}\\&=&T\left(c_kA^k+c_{k-1}A^{k-1}+\cdots+c_1A+c_0I_n\right)T^{-1}\\&=&Tf(A)T^{-1}\end{array}\] Because #T# is invertible, #Tf(A)T^{-1} = 0# if and only if #f(A) = 0#. Thus #A# and #T\,A\, T^{-1}# are zeros of exactly the same polynomials, and it follows that the minimal polynomial of #A# is equal to the minimal polynomial of #T\,A\, T^{-1}#.
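The identity #f(T\,A\,T^{-1}) = T\,f(A)\,T^{-1}# can be checked numerically on a small case. The following is a minimal sketch in plain Python. The matrices `A` and `T`, the polynomials `f` and `p`, and the helper names `mat_mul` and `poly_at` are all illustrative choices and are not taken from the text.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at(coeffs, M):
    """Evaluate c_0*I + c_1*M + c_2*M^2 + ... at the matrix M (Horner's scheme).

    coeffs lists the coefficients from the constant term upwards.
    """
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += c  # add c times the identity matrix
    return result


# Hypothetical test data: any A, invertible T and polynomial f would do.
A = [[1, 2], [3, 4]]
T = [[1, 1], [0, 1]]
T_inv = [[1, -1], [0, 1]]  # inverse of T
f = [-7, 2, 0, 1]          # f(x) = x^3 + 2x - 7

conjugate = mat_mul(mat_mul(T, A), T_inv)
lhs = poly_at(f, conjugate)                      # f(T A T^{-1})
rhs = mat_mul(mat_mul(T, poly_at(f, A)), T_inv)  # T f(A) T^{-1}

# p(x) = x^2 - 5x - 2 is the characteristic polynomial of A (trace 5, det -2),
# so both A and its conjugate should be zeros of p.
p = [-2, -5, 1]
```

Here `lhs == rhs` holds, and `poly_at(p, A)` and `poly_at(p, conjugate)` are both the zero matrix, in line with the statement that a matrix and its conjugates are zeros of the same polynomials.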

3. Now let #f# be an arbitrary polynomial with #f(A)=0#. Division with remainder by #m# gives polynomials #q# and #r# with \[ f (x)=q(x)\cdot m (x)+ r(x)\] such that the degree of #r# is smaller than the degree of #m#. We substitute #A# for #x# and rearrange the terms to get

\[r(A) = f(A) - q(A)\cdot m(A) = 0- q(A)\cdot 0 = 0\]

Because the degree of #r# is smaller than the degree of #m# and #m# is a polynomial of minimal degree with the property that #m(A) =0#, it follows that #r(x) = 0#, so #f(x) = q(x)\cdot m(x)#. This shows that #m# divides #f#.

4. Let #\lambda# be a root of the characteristic polynomial of #A#, that is, #\det\left(A-\lambda\, I_n\right)=0#. According to Invertibility and rank, it follows that the kernel of #A-\lambda \,I_n# contains at least one vector, say #\vec{v}#, unequal to the zero vector. This vector satisfies #\left(A-\lambda\, I_n\right)\vec{v}=\vec{0}#, thus \(A\vec{v}=\lambda\vec{v}\). Repeated application gives #A^k\vec{v}=\lambda^k\vec{v}# for every natural number #k#, so #f(A)\vec{v}=f(\lambda)\vec{v}# for every polynomial #f#. In particular, \[\vec{0} = m_A(A)\vec{v} = m_A(\lambda)\vec{v} \] Because #\vec{v}\ne\vec{0}#, we find that \( m_A(\lambda)=0\), that is, #\lambda# is a root of the minimal polynomial of #A#.

The only matrix with minimal polynomial #m(x)=x# is the zero matrix.

If #n\gt1#, then every natural number from #1# up to #n# occurs as the degree of the minimal polynomial of some #(n\times n)#-matrix. We show this for the largest degree, #n#. If

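\[A =\matrix{0&1&0&0&\cdots&0\\ 0&0&1&0&\cdots&0\\ 0&0&\ddots&\ddots&\cdots&0 \\0&0&0&\ddots&1&0\\ 0&0&0&\ddots&0&1\\ 0&0&0&\cdots&0&0 }\] then #A# satisfies #A^n=0# but not #A^{n-1} = 0#. Because #A^n=0#, the minimal polynomial of #A# divides #x^n#, so it equals #x^k# for some #k\le n#. As #A^{n-1}\ne0#, we find #k=n#. The degree can therefore be #n#.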
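The claim about the powers of this matrix can be checked for small sizes. The following is a rough sketch in plain Python. The helper names `shift_matrix` and `nilpotency_index` are made up for this illustration.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift_matrix(n):
    """The (n x n)-matrix with ones just above the diagonal and zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]


def nilpotency_index(A):
    """Smallest k >= 1 with A^k = 0, or None if no power up to A^n vanishes."""
    n = len(A)
    zero = [[0] * n for _ in range(n)]
    power = A  # holds A^k at the start of iteration k
    for k in range(1, n + 1):
        if power == zero:
            return k
        power = mat_mul(power, A)
    return None


# For the shift matrix the first vanishing power is expected to be A^n.
indices = [nilpotency_index(shift_matrix(n)) for n in range(1, 7)]
```

For each size tried, the first vanishing power is #A^n#, which matches the minimal polynomial #x^n#.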

Uniqueness. Previously we saw that the conjugacy class of a square matrix #A# is not uniquely determined by the characteristic polynomial. The question now arises whether the characteristic polynomial and the minimal polynomial together determine the conjugacy class of #A# uniquely. The earlier example of the non-conjugate #(2\times2)#-matrices #N=\matrix{0&1\\ 0&0}# and the zero matrix is no longer a counterexample: the minimal polynomials of #N# and the zero matrix are #x^2# and #x#, respectively, so the minimal polynomial distinguishes the conjugacy classes of these two matrices. But, for the same #(2\times2)#-matrix #N# as above, the #(4\times4)#-matrices \[A=\matrix{N&0\\ 0&0}\phantom{xx}\text{ and } \phantom{xxx} B=\matrix{N&0\\ 0&N}\] both have characteristic polynomial #x^4# and minimal polynomial #x^2#, while they are not conjugate: conjugate matrices have the same rank, whereas #A# has rank #1# and #B# has rank #2#. This provides the answer to the question raised: the characteristic polynomial and the minimal polynomial together do not determine the conjugacy class of a square matrix uniquely.

Diagonal Form. An immediate consequence of the last statement of the theorem (each root of the characteristic polynomial is a root of the minimal polynomial) is that the minimal polynomial has degree #n# if all #n# roots of the characteristic polynomial are different. Later we will see that a matrix of a linear map #L:V\to V# is conjugate to a diagonal matrix (only over the complex numbers if there are non-real roots) if and only if the minimal polynomial of #L# has no multiple roots.

The minimal polynomial of a square matrix #A# can be determined in at least two ways:

1. Compute the characteristic polynomial #p_A#. Find the largest monic divisor #n_A# of #p_A# without double complex roots. Search among the monic divisors of #p_A# for a multiple of #n_A# of smallest degree such that #A# is a zero of it.
2. Find a linear relationship of the form #c_0\cdot I+c_1\cdot A+c_2\cdot A^2+\cdots +c_{k-1}\cdot A^{k-1}+A^k=0# for the smallest possible #k#. Then we must have #m_A (x)= c_0+c_1\cdot x+c_2\cdot x^2+\cdots+c_{k-1}\cdot x^{k-1}+x^k#.

The first method is feasible if #p_A# has many different roots. The second method is very straightforward. We give some examples.

Calculate the minimal polynomial #m_A(x)# of the matrix \[ A = \matrix{-2 & -1 & 0 \\ 4 & 2 & 1 \\ 3 & 2 & 1 \\ } \]

#m_A(x) =# #x^3-x^2-2 x-1#

We are looking for the lowest degree monic polynomial in #x# that becomes the zero matrix upon substitution of #A# for #x#. To this end, we first calculate the relevant powers of #A#:
\[\begin{array}{rcl}
A^2 &=& \matrix{0 & 0 & -1 \\ 3 & 2 & 3 \\ 5 & 3 & 3 \\ }\\ A^3 &=& \matrix{-3 & -2 & -1 \\ 11 & 7 & 5 \\ 11 & 7 & 6 \\ }
\end{array}\] We next consider the polynomial #a+b\cdot x+c\cdot x^2+d\cdot x^3# with coefficients #a#, #b#, #c#, #d# to be determined, such that the zero matrix appears after we substitute #A# for #x#. This gives
\[\matrix{-3 d-2 b+a & -2 d-b & -d-c \\ 11 d+3 c+4 b & 7 d+2 c+2 b+a & 5 d+3 c+b \\ 11 d+5 c+3 b & 7 d+3 c+2 b & 6 d+3 c+b+a \\ } = \matrix{0&0&0\\ 0&0&0\\ 0&0&0} \] This is a system of #9# linear equations with unknowns #a#, #b#, #c#, #d#. Its solution, written with #d# as a parameter, is
\[ a=-d ,\phantom{xx} b=-2 d ,\phantom{xx} c=-d \] Apparently, there is a non-zero solution only if #d\ne0#, so no non-zero polynomial of degree less than #3# has #A# as a zero. The minimal polynomial thus has degree #3#. Because the minimal polynomial #m_A(x)# is monic, we must take #d=1# to find the answer. This gives the solution \[ a=-1 ,\phantom{xx} b=-2 ,\phantom{xx} c=-1 \] so the answer is #m_A(x) = x^3-x^2-2 x-1#.

The characteristic polynomial of #A# is equal to #-1# times the minimal polynomial.
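The search for the smallest #k# such that #A^k# is a linear combination of #I, A, \ldots, A^{k-1}# can also be carried out mechanically. The following is a minimal sketch in plain Python with exact rational arithmetic. It is one possible way to organise the computation, not the method prescribed by the text, and the function names `solve_exact` and `minimal_polynomial` are illustrative.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve_exact(columns, target):
    """Solve sum_j x_j * columns[j] = target exactly by Gaussian elimination.

    Returns the solution as a list, or None if the system is inconsistent.
    Assumes (and checks) that the given columns are linearly independent.
    """
    rows, k = len(target), len(columns)
    aug = [[Fraction(columns[j][i]) for j in range(k)] + [Fraction(target[i])]
           for i in range(rows)]
    pivot_row = 0
    for col in range(k):
        pivot = next((r for r in range(pivot_row, rows) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("columns are linearly dependent")
        aug[pivot_row], aug[pivot] = aug[pivot], aug[pivot_row]
        pv = aug[pivot_row][col]
        aug[pivot_row] = [v / pv for v in aug[pivot_row]]
        for r in range(rows):
            if r != pivot_row and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[pivot_row])]
        pivot_row += 1
    # The remaining rows read 0 = (right-hand side); all must be 0 = 0.
    if any(aug[r][k] != 0 for r in range(pivot_row, rows)):
        return None
    return [aug[j][k] for j in range(k)]


def minimal_polynomial(A):
    """Coefficients [c_0, ..., c_{k-1}, 1] of the minimal polynomial of A."""
    n = len(A)
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]  # A^0 = I
    for k in range(1, n + 1):
        powers.append(mat_mul(powers[-1], A))
        flat = [[x for row in P for x in row] for P in powers]
        # Try to write A^k as a combination of I, A, ..., A^(k-1).
        coeffs = solve_exact(flat[:k], flat[k])
        if coeffs is not None:
            return [-c for c in coeffs] + [Fraction(1)]
    raise AssertionError("unreachable: Cayley-Hamilton guarantees k <= n")


A = [[-2, -1, 0], [4, 2, 1], [3, 2, 1]]
coefficients = minimal_polynomial(A)  # constant term first
```

For the matrix #A# of the example, `coefficients` corresponds to #-1-2x-x^2+x^3#, in agreement with the answer found by hand. Working with `Fraction` avoids rounding errors, which matters because the test for a linear relationship is an exact equality.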
