Characteristic polynomial of a matrix

Previously we saw that a linear map $L:V\to V$, where $V$ is a finite-dimensional vector space, is determined by a square matrix. Such a matrix is uniquely determined once a basis $\alpha$ for $V$ has been selected. To capture properties of such a linear map by means of the matrix $L_\alpha$, we have to look at functions of square matrices that do not depend on the choice of basis, that is, functions having the same value on $L_\beta$ as on $L_\alpha$ for any other basis $\beta$ of $V$. The rank is such a function, and the characteristic polynomial leads to several more, as we will see later.

Let $A$ be an $(n\times n)$-matrix. Then $\det (A-x I_n)$ is a polynomial in $x$ of degree $n$. This polynomial is called the characteristic polynomial of $A$ and is denoted by $p_A(x)$.

Indicate by $a_{ij}$ the $(i,j)$ -element of $A$ .

The leading coefficient of this polynomial is $(-1)^n$ .
The coefficient of $x^{n-1}$ is equal to $(-1)^{n-1}(a_{11}+a_{22}+\cdots +a_{nn})$ .
The constant term is equal to $\det(A)$ .

The sum $a_{11}+a_{22}+\cdots +a_{nn}$ of the diagonal entries of $A$ is called the trace of the matrix and is often indicated by $\text{tr}(A)$ .

The characteristic polynomial is the determinant

$\left|\,\begin{array}{cccc} a_{11}-x & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22}-x & & \vdots \\ \vdots & & \ddots & \vdots\\ a_{n1} & \cdots & \cdots & a_{nn}-x \end{array}\,\right|$ This determinant is the sum of $n!$ terms, and each term consists of a product of $n$ matrix entries that are selected in such a way that from each row and each column exactly one element is taken. Therefore, each term in this sum is a polynomial in $x$ of degree at most $n$ . One of the terms is:

$\begin{array}{l} (a_{11}-x)(a_{22}-x)\cdots (a_{nn}-x) \\ \quad \quad= (-1)^nx^n+(-1)^{n-1}(a_{11}+a_{22}+\cdots +a_{nn})x^{n-1}+\cdots \end{array}$ This term corresponds to the identity permutation of $\{1,\ldots,n\}$. Each of the remaining terms contains a factor $a_{ij}$ with $i\neq j$. In such a term, the factors $(a_{ii}-x)$ and $(a_{jj}-x)$ do not occur, because $(a_{ii}-x)$ lies in the same row as $a_{ij}$ and $(a_{jj}-x)$ lies in the same column. Such a term therefore contains at most $n-2$ diagonal factors, so it is a polynomial in $x$ of degree at most $n-2$. Consequently, the characteristic polynomial is a polynomial of degree $n$ of the form

$(-1)^nx^n+(-1)^{n-1}(a_{11}+a_{22}+\cdots +a_{nn})x^{n-1}+\cdots +c_1x +c_0$ This shows that the leading coefficient and the coefficient of $x^{n-1}$ are as stated. In order to find the constant term $c_0$ , we substitute $x =0$ . Then, the polynomial $\det (A-x I_n)$ becomes equal to $c_0 =\det(A)$ .
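The expansion over permutations used in this proof can be carried out mechanically. The following is a minimal sketch in Python, not an efficient method, since it visits all $n!$ terms. The helper names `poly_mul`, `perm_sign` and `char_poly`, as well as the sample matrix, are assumptions made for illustration only.

```python
from itertools import permutations

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def char_poly(A):
    """Coefficients [c_0, c_1, ..., c_n] of det(A - x I_n), summed over all permutations."""
    n = len(A)
    total = [0] * (n + 1)
    for perm in permutations(range(n)):
        term = [perm_sign(perm)]
        for i, j in enumerate(perm):
            # Entry (i, j) of A - x I_n as a polynomial:
            # a_ij - x on the diagonal, the constant a_ij elsewhere.
            entry = [A[i][j], -1] if i == j else [A[i][j]]
            term = poly_mul(term, entry)
        for k, c in enumerate(term):
            total[k] += c
    return total

# Hypothetical sample matrix, chosen only for illustration.
A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]
n = len(A)
trace = sum(A[i][i] for i in range(n))
coeffs = char_poly(A)

print(coeffs)                                        # [c_0, c_1, c_2, c_3]
print(coeffs[n] == (-1) ** n)                        # leading coefficient
print(coeffs[n - 1] == (-1) ** (n - 1) * trace)      # coefficient of x^(n-1)
```

The first entry of `coeffs` is the constant term $c_0$, which should agree with $\det(A)$ computed by any other method.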

The characteristic polynomial of a $(2\times2)$ -matrix $A$ is equal to $x^2-\text{tr}(A)x+\det(A)$ , so it is completely determined by the trace and determinant.

The notation $\text{tr}(A)$ for the trace of $A$ abbreviates the word trace.

The significance of this characteristic polynomial is that it helps to determine many important properties for the linear map corresponding to $A$ , like the set of vectors that are mapped onto a scalar multiple of themselves.

The solutions of the equation that arises when we equate the characteristic polynomial to $0$ , are a key to the determination of a simple matrix form of the linear map determined by $A$ . They will appear later under the name eigenvalues.

Characteristic equation

The polynomial equation

$\det(A-xI_n) = 0$ with unknown $x$ is called the characteristic equation of $A$ .

The trace of $A$ is the sum of the complex roots of the characteristic equation.
The determinant of $A$ is the product of the complex roots of the characteristic equation.

Let $x_1,\ldots,x_n$ be the $n$ (complex) roots of the characteristic equation. Write the characteristic polynomial as

$\begin{array}{rcl}\det(A-xI_n) &=& (-1)^n(x-x_1)(x-x_2)\cdots (x-x_n)\\ &=& (-1)^nx^n+(-1)^{n-1}(x_1+x_2+\cdots+ x_n)x^{n-1}+\cdots +x_1x_2\cdots {}x_n \end{array}$

Comparison of the coefficient of $x^{n-1}$ in this expression and the above formula for the characteristic polynomial shows that $x_1+x_2+\cdots+ x_n$ is the trace of $A$ .
Comparison of the constant term $x_1x_2\cdots {}x_n$ with the constant term in the above formula for the characteristic polynomial shows that $x_1x_2\cdots{} x_n$ is the determinant of $A$ .
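These two statements can be checked numerically on a small case. The sketch below assumes a sample $(3\times 3)$-matrix whose characteristic equation happens to have three distinct integer roots, so that a brute-force search over a small window finds all of them. The function names `det3` and `char_value` are hypothetical helpers, and the search would not work for a general matrix.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_value(A, x):
    """Value of det(A - x I_3) at the number x."""
    shifted = [[A[r][c] - (x if r == c else 0) for c in range(3)] for r in range(3)]
    return det3(shifted)

# Sample matrix (an assumption for illustration).
A = [[3, 1, -1],
     [0, -2, 0],
     [0, -6, 4]]

# Brute-force search for integer roots in a small window. This finds all roots
# only because this particular matrix has three distinct integer roots.
roots = [x for x in range(-10, 11) if char_value(A, x) == 0]

trace = sum(A[i][i] for i in range(3))
product = 1
for r in roots:
    product *= r

print(roots)
print(sum(roots) == trace)     # sum of the roots versus the trace
print(product == det3(A))      # product of the roots versus the determinant
```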

Dimension 2

As we have seen above, the characteristic polynomial of

$A=\matrix{a&b\\ c&d}$ is equal to

$x^2-\text{tr}(A)x+\det(A)=x^2-(a+d)x+(a\cdot d - b\cdot c)$ The quadratic formula (also known as the abc-formula) yields the solutions of the characteristic equation

$x_1 = \dfrac{a+d-\sqrt{(a-d)^2+4bc}}{2}\phantom{x}\text{ and }\phantom{x} x_2 = \dfrac{a+d+\sqrt{(a-d)^2+4bc}}{2}$ We see that, indeed,

$\begin{array}{rcl} x_1+x_2 &=& a+d=\text{tr}(A)\\ x_1\cdot x_2 &=& \dfrac{(a+d)^2-(a-d)^2-4bc}{4} =ad-bc = \det(A)\end{array}$ We deal individually with each of the three cases for the solutions of a quadratic equation using the discriminant $(a-d)^2+4bc$ .

If the discriminant is positive, there are two distinct real roots. One example is the diagonal matrix $A = \matrix{a&0\\ 0&d}$ with $a\neq d$, in which case the discriminant is $(a-d)^2$ and the characteristic polynomial is $(x-a)\cdot(x-d)$, so the entries $a$ and $d$ on the diagonal are the roots of the characteristic equation.
If the discriminant is equal to $0$, then the roots coincide. We then count this root double. An example is $A = \matrix{0&b\\ 0&0}$. The characteristic equation is $x^2=0$, so the roots are both equal to $0$. If we choose $b\ne0$, then $A$ is distinct from the zero matrix, while its characteristic polynomial is the same as that of the zero matrix.
If the discriminant $(a-d)^2+4bc$ of the quadratic equation in $x$ is negative, then the two solutions are non-real and complex conjugates of each other. For example, the matrix of a rotation by $\varphi$ around the origin in $\mathbb{R}^2$ is $A = \matrix{\cos(\varphi)&-\sin(\varphi)\\\sin(\varphi)&\cos(\varphi)}$. The discriminant is equal to $-4\sin^2(\varphi)$, which is negative unless $\sin(\varphi)=0$, and the solutions of the characteristic equation are $x_1 =\cos(\varphi)-\sin(\varphi)\ii\quad\text{and}\quad x_2 =\cos(\varphi)+\sin(\varphi)\ii$
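The three cases can be reproduced with a short computation. The sketch below is an illustration under assumptions: the function name `roots_2x2` and the three sample matrices are hypothetical, and Python's standard `cmath.sqrt` is used so that a negative discriminant yields complex roots instead of an error.

```python
import cmath

def roots_2x2(a, b, c, d):
    """Discriminant and roots of x^2 - (a+d) x + (ad - bc) = 0 for A = [[a, b], [c, d]]."""
    disc = (a - d) ** 2 + 4 * b * c
    root = cmath.sqrt(disc)  # complex square root, also valid for disc < 0
    return disc, (a + d - root) / 2, (a + d + root) / 2

# Sample matrices (assumptions for illustration), one per case.
cases = {
    "diagonal with a != d": (2, 0, 0, 5),    # positive discriminant
    "upper triangular, b != 0": (0, 3, 0, 0),  # zero discriminant
    "rotation by 90 degrees": (0, -1, 1, 0),   # negative discriminant
}

for name, (a, b, c, d) in cases.items():
    disc, x1, x2 = roots_2x2(a, b, c, d)
    # In each case x1 + x2 should equal the trace and x1 * x2 the determinant.
    print(name, disc, x1, x2, x1 + x2, x1 * x2)
```

The roots are returned as complex numbers in all three cases. In the first two cases the imaginary part is zero.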

Since the determinant of a triangular matrix equals the product of the diagonal entries, the diagonal entries of a triangular matrix are precisely the solutions of its characteristic equation. Indeed, if $A$ is an $(n\times n)$-triangular matrix, then so is $A-xI_n$, yielding the following characteristic equation

$(a_{11}-x)(a_{22}-x)\cdots (a_{nn}-x)=0$ where $a_{11},\ldots,a_{nn}$ are the diagonal entries of $A$ . The characteristic equation is satisfied if and only if $x$ equals one of the diagonal entries.

Block matrix

According to the first statement from Determinants of some special matrices, the determinant of a square matrix of the form

$M = \matrix{A&C\\ 0&B}$ in which $A$ and $B$ are square submatrices and $C$ is an arbitrary matrix of appropriate dimensions, equals:

$\det(M)=\det(A)\cdot\det(B)$ Therefore, if $M$ is an $(n\times n)$ -matrix, $A$ an $(m\times m)$ -matrix, and $B$ a $(k\times k)$ -matrix, then the characteristic polynomial of $M$ is

$\det(M-xI_n)=\det(A-xI_m)\cdot\det(B-xI_k)$ The solutions of the characteristic equation of $M$ are therefore the solutions of the characteristic equation of $A$ together with those of the characteristic equation of $B$, counted with multiplicity.
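The factorization for block matrices can be tested on a concrete example. The sketch below assumes small integer blocks $A$, $B$ and $C$ chosen for illustration, and compares both sides of the identity at several values of $x$. The helper names `det` and `char_value` are hypothetical, and the cofactor expansion is only practical for small matrices.

```python
def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def char_value(M, x):
    """Value of det(M - x I) at the number x."""
    n = len(M)
    return det([[M[i][j] - (x if i == j else 0) for j in range(n)] for i in range(n)])

# Sample blocks (assumptions for illustration).
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [0, 7]]
C = [[1, 0],
     [2, 3]]

# Assemble the block matrix M = [[A, C], [0, B]].
M = [A[i] + C[i] for i in range(2)] + [[0, 0] + B[i] for i in range(2)]

# Compare both sides of the identity at seven sample points.
all_agree = all(
    char_value(M, x) == char_value(A, x) * char_value(B, x)
    for x in range(-3, 4)
)
print(all_agree)
```

Both sides are polynomials in $x$ of degree $4$, so agreement at seven distinct points means that they coincide as polynomials for this example.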

The value $x=0$ is a solution to the characteristic equation $\det(A-xI_n)=0$ if and only if $\det(A)=0$ . For $n=2$ , the other solution then equals $x=\text{tr}(A)$ . For example, the solutions to the characteristic equation of the matrix

$A=\matrix{1&2\\2&4}$ are $x=0$ and $x=5$ .

Determine the characteristic polynomial of

$A=\matrix{3 & 1 & -1 \\ 0 & -2 & 0 \\ 0 & -6 & 4}$

$p_A(x) = -x^3+5 x^2+2 x-24$

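We calculate the characteristic polynomial according to its definition, expanding the determinant along the first column:

$\begin{array}{rcl} p_A(x) &=&\det(A-x\,I_3)\\&=& \left|\begin{array}{ccc}3-x & 1 & -1 \\ 0 & -x-2 & 0 \\ 0 & -6 & 4-x \end{array}\right| \\ &=& (3-x)\cdot\left|\begin{array}{cc} -x-2 & 0 \\ -6 & 4-x \end{array}\right| \\ &=& (3-x)(-x-2)(4-x) \\ &=& -x^3+5 x^2+2 x-24 \end{array}$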
