Previously we saw that a linear map #L:V\to V#, where #V# is a finite-dimensional vector space, is determined by a square matrix once a basis #\alpha# for #V# has been selected. To capture properties of such a linear map by means of the matrix #L_\alpha#, we have to look at functions of square matrices that do not depend on the choice of basis, that is, functions having the same value on #L_\beta# as on #L_\alpha# for any other basis #\beta# of #V#. The rank is such a function, and the concept of characteristic polynomial leads to several more such functions, as we will see later.
Let #A# be an #(n\times n)#-matrix. Then \[\det (A-x I_n)\] is a polynomial in #x# of degree #n#. This polynomial is called the characteristic polynomial of #A# and is denoted by #p_A(x)#.
Denote by #a_{ij}# the #(i,j)#-element of #A#.
- The leading coefficient of this polynomial is #(-1)^n#.
- The coefficient of #x^{n-1}# is equal to #(-1)^{n-1}(a_{11}+a_{22}+\cdots +a_{nn})#.
- The constant term is equal to #\det(A)#.
The sum #a_{11}+a_{22}+\cdots +a_{nn}# of the diagonal entries of #A# is called the trace of the matrix and is usually denoted by #\text{tr}(A)#.
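The three statements about the coefficients can be checked numerically. The following is a minimal sketch in Python, not part of the theory itself. The function name `char_poly` is hypothetical, and instead of expanding the determinant it uses the Faddeev-LeVerrier recursion with exact rational arithmetic. It returns the coefficients of #\det(A-xI_n)# with the constant term first.

```python
from fractions import Fraction

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(A - x I), constant term first."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]

    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    # Faddeev-LeVerrier recursion for the monic polynomial det(x I - A).
    c = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AM = matmul(A, M)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k

    # det(A - x I) = (-1)^n det(x I - A)
    sign = (-1) ** n
    return [sign * v for v in c]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # example matrix chosen for illustration
n = len(A)
coeffs = char_poly(A)
trace = sum(A[i][i] for i in range(n))
assert coeffs[n] == (-1) ** n                     # leading coefficient
assert coeffs[n - 1] == (-1) ** (n - 1) * trace   # coefficient of x^(n-1)
print([int(v) for v in coeffs])                   # [18, -24, 9, -1]
```

The constant term #18# printed here agrees with #\det(A)=2\cdot(12-1)-1\cdot(4-0)=18# for this example matrix.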
The characteristic polynomial is the determinant
\[
\left|\,\begin{array}{cccc}
a_{11}-x & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22}-x & & \vdots \\
\vdots & & \ddots & \vdots\\
a_{n1} & \cdots & \cdots & a_{nn}-x
\end{array}\,\right|
\] This determinant is the sum of #n!# terms, and each term consists of a product of #n# matrix entries that are selected in such a way that from each row and each column exactly one element is taken. Therefore, each term in this sum is a polynomial in #x# of degree at most #n#. One of the terms is:
\[
\begin{array}{l}
(a_{11}-x)(a_{22}-x)\cdots (a_{nn}-x) \\
\quad \quad= (-1)^nx^n+(-1)^{n-1}(a_{11}+a_{22}+\cdots +a_{nn})x^{n-1}+\cdots
\end{array}
\] This term corresponds to the identity permutation of #\{1,\ldots,n\}#. Each of the remaining terms contains a factor #a_{ij}# with #i\neq j#. In such a term, the factors #(a_{ii}-x)# and #(a_{jj}-x)# do not occur, because #(a_{ii}-x)# lies in the same row as #a_{ij}# and #(a_{jj}-x)# lies in the same column. At most #n-2# of the factors of such a term therefore contain #x#, and so the term is a polynomial of degree at most #n-2#. Consequently, the characteristic polynomial is a polynomial of degree #n# of the form
\[
(-1)^nx^n+(-1)^{n-1}(a_{11}+a_{22}+\cdots +a_{nn})x^{n-1}+\cdots +c_1x +c_0
\] This shows that the leading coefficient and the coefficient of #x^{n-1}# are as stated. In order to find the constant term #c_0#, we substitute #x=0#. Then #\det (A-x I_n)# becomes #\det(A)# and the polynomial above becomes #c_0#, so #c_0 =\det(A)#.
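The counting argument in this proof can be made concrete. The sketch below uses hypothetical helper names and stores polynomials as coefficient lists with the constant term first. It enumerates the #n!# terms of the determinant and prints the degree of each. Only the term of the identity permutation should reach degree #n#, and all other terms should have degree at most #n-2#.

```python
from itertools import permutations

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def degree(p):
    """Degree of a coefficient list; -1 for the zero polynomial."""
    nonzero = [k for k, v in enumerate(p) if v != 0]
    return max(nonzero) if nonzero else -1

def sign(perm):
    """Sign of a permutation, from the parity of its number of inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def leibniz_terms(A):
    """List of (permutation, term) pairs whose terms sum to det(A - x I)."""
    n = len(A)
    terms = []
    for perm in permutations(range(n)):
        term = [sign(perm)]
        for i, j in enumerate(perm):
            # Diagonal positions contribute a_ii - x, all others the constant a_ij.
            entry = [A[i][j], -1] if i == j else [A[i][j]]
            term = poly_mul(term, entry)
        terms.append((perm, term))
    return terms

def sum_terms(terms, n):
    """Add up all terms into one coefficient list of length n + 1."""
    total = [0] * (n + 1)
    for _, term in terms:
        for k, v in enumerate(term):
            total[k] += v
    return total

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # example matrix chosen for illustration
terms = leibniz_terms(A)
for perm, term in terms:
    print(perm, "degree", degree(term))
print(sum_terms(terms, len(A)))          # [18, -24, 9, -1]
```

Enumerating all permutations takes #n!# steps, so this is meant only for small matrices.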
The characteristic polynomial of a #(2\times2)#-matrix #A# is equal to #x^2-\text{tr}(A)x+\det(A)#, so it is completely determined by the trace and determinant.
The notation #\text{tr}(A)# for the trace of #A# refers to the word trace.
The significance of the characteristic polynomial is that it helps to determine many important properties of the linear map corresponding to #A#, such as the set of vectors that are mapped onto a scalar multiple of themselves.
The solutions of the equation obtained by setting the characteristic polynomial equal to #0# are a key to finding a simple matrix form of the linear map determined by #A#. They will appear later under the name eigenvalues.
The polynomial equation \[\det(A-xI_n) = 0\] with unknown #x# is called the characteristic equation of #A#.
- The trace of #A# is the sum of the complex roots of the characteristic equation, counted with multiplicity.
- The determinant of #A# is the product of the complex roots of the characteristic equation, counted with multiplicity.
Let #x_1,\ldots,x_n# be the #n# (complex) roots of the characteristic equation. Write the characteristic polynomial as \[ \begin{array}{rcl}\det(A-xI_n) &=& (-1)^n(x-x_1)(x-x_2)\cdots (x-x_n)\\ &=& (-1)^nx^n+(-1)^{n-1}(x_1+x_2+\cdots+ x_n)x^{n-1}+\cdots +x_1x_2\cdots {}x_n \end{array} \]
- Comparison of the coefficient of #x^{n-1}# in this expression and the above formula for the characteristic polynomial shows that #x_1+x_2+\cdots+x_n# is the trace of #A#.
- Comparison of the constant term #x_1x_2\cdots {}x_n # with the constant term in the above formula for the characteristic polynomial shows that #x_1x_2\cdots{} x_n# is the determinant of #A#.
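As a quick illustration of this comparison, the sketch below expands #(-1)^n(x-x_1)\cdots(x-x_n)# for a chosen list of roots and reads off the two coefficients. The helper name `poly_from_roots` and the sample roots are assumptions made for the example. The code only illustrates the algebra and does not compute the roots of a given matrix.

```python
import math

def poly_from_roots(roots):
    """Coefficients (constant term first) of (-1)^n (x - x_1) ... (x - x_n)."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial p(x) by (x - r).
        shifted = [0] + coeffs                     # x * p(x)
        scaled = [-r * c for c in coeffs] + [0]    # -r * p(x)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    sign = (-1) ** len(roots)
    return [sign * c for c in coeffs]

roots = [2, 5, -1]                                  # sample roots
n = len(roots)
p = poly_from_roots(roots)
assert p[n - 1] == (-1) ** (n - 1) * sum(roots)     # sum of roots, up to sign
assert p[0] == math.prod(roots)                     # product of roots
print(p)                                            # [-10, -3, 6, -1]
```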
As we have seen above, the characteristic polynomial of \[A=\matrix{a&b\\ c&d}\]is equal to \[x^2-\text{tr}(A)x+\det(A)=x^2-(a+d)x+(a\cdot d - b\cdot c)\] The quadratic formula yields the solutions of the characteristic equation\[x_1 = \dfrac{a+d-\sqrt{(a-d)^2+4bc}}{2}\phantom{x}\text{ and }\phantom{x} x_2 = \dfrac{a+d+\sqrt{(a-d)^2+4bc}}{2}\]We see that, indeed,\[\begin{array}{rcl} x_1+x_2 &=& a+d=\text{tr}(A)\\ x_1\cdot x_2 &=& \dfrac{(a+d)^2-(a-d)^2-4bc}{4} =ad-bc = \det(A)\end{array}\]We treat each of the three cases for the solutions of a quadratic equation, distinguished by the sign of the discriminant #(a-d)^2+4bc#.
- If the discriminant is positive, there are two distinct real roots. One example is the diagonal matrix \[A = \matrix{a&0\\ 0&d}\] with #a\ne d#, whose discriminant is #(a-d)^2\gt0#. In this case the characteristic polynomial is #(x-a)\cdot(x-d)#, so the entries #a# and #d# on the diagonal are the roots of the characteristic equation.
- If the discriminant is equal to #0#, then the roots coincide. We then count this root double. An example is \[A = \matrix{0&b\\ 0&0}\] The characteristic equation is #x^2=0#, so the roots are both equal to #0#. If we choose #b\ne0#, then #A# is distinct from the zero matrix, while the characteristic polynomial does not differ from the characteristic polynomial of the zero matrix.
- If the discriminant #(a-d)^2+4bc# of the quadratic equation in #x# is negative, then the solutions are two non-real complex numbers that are each other's complex conjugates. For example, the matrix of a rotation by #\varphi# around the origin in #\mathbb{R}^2# is \[A = \matrix{\cos(\varphi)&-\sin(\varphi)\\\sin(\varphi)&\cos(\varphi)}\] The discriminant is equal to #-4\sin^2(\varphi)#, which is negative unless #\sin(\varphi)=0#, and the solutions of the characteristic equation are complex: \[x_1 =\cos(\varphi)-\sin(\varphi)\ii\quad\text{and}\quad x_2 =\cos(\varphi)+\sin(\varphi)\ii\]
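The three cases can be explored with a short sketch. The function name `roots_2x2` and the sample matrices are assumptions made for the example. The function applies the formulas for #x_1# and #x_2# given above and uses Python's `cmath.sqrt`, so that a negative discriminant yields complex roots.

```python
import cmath
import math

def roots_2x2(a, b, c, d):
    """Discriminant and roots of x^2 - (a + d) x + (a d - b c) = 0."""
    disc = (a - d) ** 2 + 4 * b * c
    root = cmath.sqrt(disc)
    return disc, (a + d - root) / 2, (a + d + root) / 2

phi = math.pi / 3
examples = {
    "diagonal with a != d": (1, 0, 0, 4),
    "nonzero with double root 0": (0, 7, 0, 0),
    "rotation by phi": (math.cos(phi), -math.sin(phi), math.sin(phi), math.cos(phi)),
}
for name, (a, b, c, d) in examples.items():
    disc, x1, x2 = roots_2x2(a, b, c, d)
    # Sum and product of the roots should match trace and determinant.
    assert cmath.isclose(x1 + x2, a + d, abs_tol=1e-12)
    assert cmath.isclose(x1 * x2, a * d - b * c, abs_tol=1e-12)
    print(name, "| discriminant:", disc, "| roots:", x1, x2)
```

Because of floating-point rounding, the checks for the rotation matrix use a tolerance rather than exact equality.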
Since the determinant of a triangular matrix equals the product of the diagonal entries, the diagonal entries of a triangular matrix are precisely the solutions of its characteristic equation. Indeed, if #A# is an #(n\times n)#-triangular matrix, then so is #A-xI_n#, yielding the following characteristic equation \[(a_{11}-x)(a_{22}-x)\cdots (a_{nn}-x)=0\] where #a_{11},\ldots,a_{nn}# are the diagonal entries of #A#. The characteristic equation is satisfied if and only if #x# equals one of the diagonal entries.
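As an illustration, with a matrix chosen here only as an example, take the upper triangular matrix \[A=\matrix{2&5&1\\ 0&-1&3\\ 0&0&2}\] Its characteristic equation is \[(2-x)(-1-x)(2-x)=0\] so the solutions are #x=2#, counted twice, and #x=-1#. The entries above the diagonal play no role. In agreement with the statements on trace and determinant, #2+(-1)+2=3=\text{tr}(A)# and #2\cdot(-1)\cdot 2=-4=\det(A)#.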
According to the first statement from Determinants of some special matrices, the determinant of a square matrix of the form\[M = \matrix{A&C\\ 0&B}\] in which #A# and #B# are square submatrices and #C# is an arbitrary matrix of appropriate dimensions, equals:\[\det(M)=\det(A)\cdot\det(B)\]Therefore, if #M# is an #(n\times n)#-matrix, #A# an #(m\times m)#-matrix, and #B# a #(k\times k)#-matrix, so that #n=m+k#, then the characteristic polynomial of #M# is\[\det(M-xI_n)=\det(A-xI_m)\cdot\det(B-xI_k)\]The solutions of the characteristic equation of #M# are therefore the solutions of the characteristic equation of #A# together with the solutions of the characteristic equation of #B#.
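As an illustration, with matrices chosen here only as an example, take \[M=\matrix{1&2&7\\ 2&4&8\\ 0&0&3}\] with #A=\matrix{1&2\\ 2&4}#, #B=\matrix{3}#, and #C=\matrix{7\\ 8}#. Then \[\det(M-xI_3)=\det(A-xI_2)\cdot\det(B-xI_1)=\left((1-x)(4-x)-4\right)\cdot(3-x)=(x^2-5x)(3-x)\] The characteristic equation of #M# therefore has the solutions #x=0# and #x=5#, coming from #A#, and #x=3#, coming from #B#. The entries of #C# do not influence the result.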
The value #x=0# is a solution to the characteristic equation #\det(A-xI_n)=0# if and only if #\det(A)=0#. For #n=2#, the other solution then equals #x=\text{tr}(A)#. For example, the solutions to the characteristic equation of the matrix \[A=\matrix{1&2\\2&4}\] are #x=0# and #x=5#.
Determine the characteristic polynomial of \[A=\matrix{3 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & -7 & 4}\]
\(p_A(x) = \) \(-x^3+4 x^2+9 x-36\)
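We calculate the characteristic polynomial according to its definition. Since #A-x\,I_3# is a lower triangular matrix, its determinant is the product of its diagonal entries:
\[\begin{array}{rcl}
p_A(x) &=&\det(A-x\,I_3)\\&=& \left|\begin{array}{ccc}3-x & 0 & 0 \\ 0 & -x-3 & 0 \\ 0 & -7 & 4-x
\end{array}\right| \\ &=& (3-x)(-x-3)(4-x)\\ &=& -x^3+4 x^2+9 x-36
\end{array}
\]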