Let #V# and #W# be vector spaces. Because a linear mapping #V\to W# is of course just a special case of a general mapping, we can talk for example about the image of a vector or the image of a subset of #V# and about the full inverse image of a subset of #W#.
Let # L : V \rightarrow W# be a mapping, #D# a subset of #V#, and #E# a subset of #W#.
- By # L (D)# we denote the image of #D# under #L#: the set #\left\{L(\vec{d})\mid \vec{d}\in D\right\}#.
- By #L^{-1}(E)# we denote the full inverse image of #E# under #L#: the set #\left\{\vec{x}\in V\mid L(\vec{x})\in E\right\}#.
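For readers who want to experiment, the following minimal sketch illustrates both set operations for a hypothetical mapping #L:\mathbb{R}^2\to\mathbb{R}^2# (the projection onto the first coordinate axis, chosen only for illustration); the full inverse image is sampled over a finite grid, since as a subset of #\mathbb{R}^2# it is infinite.

```python
# Minimal sketch (illustration only): L(x1, x2) = (x1, 0),
# the projection of R^2 onto the first coordinate axis.
def L(v):
    x1, x2 = v
    return (x1, 0)

# Image of a subset D of V: the set {L(d) | d in D}.
D = [(1, 2), (3, 4), (-1, 2)]
print({L(d) for d in D})                 # {(1, 0), (3, 0), (-1, 0)}

# Full inverse image of a subset E of W, sampled over a finite grid of V
# (over all of R^2 it is the vertical line x1 = 1).
E = {(1, 0)}
grid = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
print([v for v in grid if L(v) in E])    # the grid points (1, -2), ..., (1, 2)
```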
Let the mapping #L:\mathbb{R}\to\mathbb{R}# be given by #L(x) = a\, x+ b#.
The image of #\mathbb{R}# under #L# is equal to #\mathbb{R}# if #a\ne0# and equal to #\{b\}# if #a=0#.
The full inverse image of #\{c\}# under #L# consists of the solutions of #a\,x+b=c# and is thus equal to
- #\left\{\frac{c-b}{a}\right\}# if #a\ne0#
- #\mathbb{R}# if #a=0# and #b=c#
- #\emptyset# if #a=0# and #b\ne c#
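The case distinction can be reproduced with a computer algebra system. The following sketch (using sympy, with values of #a#, #b#, and #c# chosen only for illustration) solves #a\,x+b=c# over the reals in each of the three cases.

```python
import sympy as sp

x = sp.symbols('x')

def full_inverse_image(a, b, c):
    """Solution set of a*x + b = c over the reals, i.e. the full inverse
    image of {c} under L(x) = a*x + b."""
    return sp.solveset(sp.Eq(a*x + b, c), x, domain=sp.S.Reals)

print(full_inverse_image(2, 1, 5))   # {2}, the single point (c - b)/a
print(full_inverse_image(0, 3, 3))   # Reals, since b = c
print(full_inverse_image(0, 3, 7))   # EmptySet, since b != c
```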
We denote the full inverse image of #E# under #L# by #L^{-1}(E)# or # L^{\leftarrow}(E)#. The former notation is most common in mathematics, but may lead to confusion with the notation for the inverse of a mapping. To avoid confusion when using this notation we often add the meaning in words, such as "full inverse image #L^{-1}(E)# of #E#".
Two important linear subspaces can be associated with linear mappings.
Let # L :V \rightarrow W# be a mapping. Define
\[\begin{array}{rclcl}
\im{L}&=&L(V) &=& \{ L (\vec{v}) \in W\mid\vec{v}\in V\} \\ \text{and}&&&&\\ \ker{L}&=& L^{-1}(\{\vec{0}\}) &=& \{\vec{v}\in V \mid L( \vec{v})=\vec{0}\}
\end{array}
\]
#\im{L}# is called the image or the image space of #L# and #\ker{L}# is called the null space or kernel of #L#.
The image space is the image of #V# under the mapping #L# and the null space is the full inverse image of #\{ \vec{0}\}# under #L#.
The first definition generalizes the column space of the coefficient matrix (that is to say: the span of the columns of the matrix) of a system of linear equations.
The second definition generalizes the solution space of a homogeneous system of linear equations.
As indicated before, an #(m\times n)#-matrix #A# is also used to refer to the linear map #L_A:\mathbb{R}^n\to\mathbb{R}^m# determined by it (given by \( L_A(\vec{x}) = A\vec{x}\)). In this way, the image and kernel of #A# are also defined: #\im{A}=\im{L_A}# and #\ker{A} = \ker{L_A}#.
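As a small numerical illustration (the matrix and vectors below are chosen only for this sketch), the rule \( L_A(\vec{x}) = A\vec{x}\) and its linearity can be checked as follows.

```python
import numpy as np

# A hypothetical (2 x 3) matrix, used only to illustrate the map L_A.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

def L_A(x):
    """The linear map R^3 -> R^2 determined by A: L_A(x) = A x."""
    return A @ x

print(L_A(np.array([1.0, -1.0, 2.0])))     # [-1.  1.]

# Linearity check on random vectors: L_A(a*u + b*v) == a*L_A(u) + b*L_A(v).
u, v = np.random.randn(3), np.random.randn(3)
a, b = 2.0, -3.0
print(np.allclose(L_A(a*u + b*v), a*L_A(u) + b*L_A(v)))   # True
```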
As the name indicates, the image space and the null space of a linear mapping are linear subspaces:
Let # L :V \rightarrow W# be a linear mapping.
- The image #\im{L}# of #L# is a linear subspace of #W#.
- The kernel #\ker{L}# of #L# is a linear subspace of #V#.
Let the mapping #L:\mathbb{R}\to\mathbb{R}# be given by #L(x) = a\, x+ b#. The image space #\im{L}# equals #\mathbb{R}# if #a\ne0#. But if #a=0# and #b\ne0# (so #L# is a constant mapping distinct from #0#), then the image space is #\{b\}#, which is not a linear subspace. The kernel #\ker{L}# consists of the solutions of #a\,x+b=0# and is thus equal to
- #\left\{\frac{-b}{a}\right\}# if #a\ne0#
- #\mathbb{R}# if #a=0# and #b=0#
- #\emptyset# if #a=0# and #b\ne 0#
In particular, #\ker{L}# is not a linear subspace of #\mathbb{R}# if #b\ne0#. We see again that #L# is linear only if #b=0#.
The image space #\im{L}# is a linear subspace of #W#: It is a subset of #W#. The zero vector of #W# is the image of the zero vector of #V# and thus belongs to #\im{L}#. If #\vec{u}#, #\vec{v}# belong to #\im{L}# and #\alpha# and #\beta# are scalars, then there are vectors #\vec{x}#, #\vec{y}# in #V# such that #L(\vec{x}) = \vec{u}# and #L(\vec{y}) = \vec{v}#, so linearity of #L# implies:
\[\begin{array}{rcl}\alpha \vec{u}+\beta \vec{v} &=& \alpha L(\vec{x})+\beta L(\vec{y})\\ &=&L(\alpha \vec{x})+L(\beta \vec{y})\\&=&L(\alpha \vec{x}+\beta \vec{y})\end{array}\] We conclude that \(\alpha \vec{u}+\beta \vec{v}\) belongs to #\im{L}#, from which we deduce that #\im{L}# is a linear subspace of #W#.
The kernel #\ker{L}# is a linear subspace of #V#: It is a subset of #V# which always contains the zero vector #\vec{0}#. Further, it follows from the linearity of #L# that if #\vec{x}# and #\vec{y}# belong to #\ker{L}# and #\alpha# and #\beta# are scalars, we have
\[L (\alpha \vec{x}+\beta \vec{y})=\alpha L(\vec{x})+\beta L(\vec{y})=\vec{0} + \vec{0}=\vec{0}\] so #\alpha \vec{x}+\beta \vec{y}\in{\ker{L}}#.
We determine the kernel and the image space of #L_A:\mathbb{R}^3\to\mathbb{R}^2#, the linear mapping determined by the #(2\times 3)#-matrix \[ A=\matrix{ 1 & -1 & 2 \\ 1 & -1 & 2 }\] This means that, if we use column vectors, the mapping rule is \[
L_A \matrix{ x_1 \\ x_2 \\ x_3 } = \matrix{ 1 & -1 & 2 \\ 1 & -1 & 2 }
\matrix{ x_1 \\ x_2 \\ x_3 }
\] The null space of #L_A# consists of all vectors # \vec{x}# that satisfy \[
\matrix{ 1 & -1 & 2 \\ 1 & -1 & 2 }
\matrix{ x_1 \\ x_2 \\ x_3 } =
\matrix{ 0 \\ 0}
\] This is a homogeneous system of linear equations with #A# as a coefficient matrix. The null space is the plane with equation #x_1 -x_2 +2x_3 =0#.
The image space of #L_A# consists of all vectors of the form
\[
\matrix{ 1 & -1 & 2 \\ 1 & -1 & 2 }
\matrix{ x_1 \\ x_2 \\ x_3 }
\] that is, vectors of the form
\[
x_1 \matrix {1 \\ 1 } +
x_2 \matrix{ -1 \\ -1 } +
x_3 \matrix{ 2 \\ 2 }
\] This describes exactly the span of the columns of #A#, that is, the column space. Since each column of #A# is a scalar multiple of #\cv{1\\1}#, we conclude that the image space equals \[\linspan{\cv{1\\ 1}}\]
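The same computation can be reproduced with a computer algebra system; the sketch below (using sympy) returns a basis of the null space and of the column space of the matrix #A# of this example.

```python
import sympy as sp

# The matrix A from the example above.
A = sp.Matrix([[1, -1, 2],
               [1, -1, 2]])

# Null space: a basis of the solution set of A x = 0,
# i.e. of the plane x1 - x2 + 2*x3 = 0.
print(A.nullspace())
# expected: [Matrix([[1], [1], [0]]), Matrix([[-2], [0], [1]])]

# Column space: a basis of the span of the columns of A.
print(A.columnspace())
# expected: [Matrix([[1], [1]])]
```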
We also determine the full inverse image #L_A^{-1}(\ell)# of the straight line #\ell# with parametric representation
\[
\ell : \quad\vec{x} = \left(\begin{array}{c} 3\\ 2 \end{array} \right) +
\lambda \left(\begin{array}{c} 2\\1 \end{array} \right)
\] So we are looking for vectors #\vec{x}# which satisfy
\[
A\vec{x} = \left( \begin{array}{c} 3+2\lambda \\ 2 + \lambda \end{array} \right)
\] for some #\lambda#. This means that we must solve the system with augmented matrix
\[
\left(\begin{array}{ccc|c}
1 & -1 & 2 & 3+2\lambda\\ 1 & -1 & 2 & 2+\lambda \\ \end{array} \right)
\]By row reduction (subtracting the first row from the second yields the condition #0=-1-\lambda#) we deduce that this system has solutions only for #\lambda =-1#. The solutions form the plane with equation #x_1-x_2+2x_3=1#.
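The condition on #\lambda# can also be found by solving the system symbolically. The sketch below (using sympy) solves #A\vec{x}# equal to the general point of #\ell# for #x_1#, #x_2#, #x_3#, and #\lambda# simultaneously.

```python
import sympy as sp

x1, x2, x3, lam = sp.symbols('x1 x2 x3 lambda')

A = sp.Matrix([[1, -1, 2],
               [1, -1, 2]])
x = sp.Matrix([x1, x2, x3])
rhs = sp.Matrix([3 + 2*lam, 2 + lam])    # the general point of the line ell

# Solve A x = rhs for x1, x2, x3 and lambda simultaneously.
print(sp.solve(list(A*x - rhs), [x1, x2, x3, lam], dict=True))
# expected: [{lambda: -1, x1: x2 - 2*x3 + 1}]
# so lambda must be -1, and the solutions satisfy x1 - x2 + 2*x3 = 1.
```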
Can you see, based on the relative position of #\ell# and the image space #\im{L_A}#, why the calculation of the full inverse image can be limited to the calculation of the full inverse image of the vector #\cv{1\\1}#?
If #A# is a matrix, then the kernel of #L_A#, the linear map determined by #A#, is the solution space of a homogeneous system of linear equations with coefficient matrix #A#, and the image of #L_A# is the column space (which is the span of the columns of the matrix) of #A#, as shown in the previous example.
The null space of #L_A# consists of all vectors # \vec{x}# that satisfy #L_A( \vec{x})=\vec{0}#, that is to say, all solutions of the homogeneous system #A\vec{x}=\vec{0}#.
The image space of #L_A# consists of all vectors of the form #L_A( \vec{x})#. If #A# has columns #\vec{a}_1,\ldots,\vec{a}_n#, then this is exactly the subset
\[
\left\{ x_1 \vec{a}_1+\cdots+x_n\vec{a}_n \mid x_1, \ldots , x_n
\in \mathbb{R}\right\}
\] of #\mathbb{R}^m#. This is the column space of #A#.
Null space and image space thus generalize two concepts from the world of matrices.
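The identity #A\vec{x} = x_1 \vec{a}_1+\cdots+x_n\vec{a}_n# behind this description is easy to check numerically; the matrix and vector below are chosen only for illustration.

```python
import numpy as np

# Illustration only: any matrix and vector of matching sizes will do.
A = np.array([[1.0, -1.0, 2.0],
              [1.0, -1.0, 2.0]])
x = np.array([3.0, 5.0, -2.0])

# A x as a matrix-vector product ...
via_product = A @ x
# ... and as the linear combination x1*a1 + x2*a2 + x3*a3 of the columns of A.
via_columns = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(np.allclose(via_product, via_columns))   # True
```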
Consider the orthogonal projection #{ P}# in #\mathbb{R}^2# on a line #\ell = \langle\vec{a}\rangle # through the origin. If we take the length of #\vec{a}# to be equal to #1#, we can describe #P# algebraically by the mapping rule \[{ P}\vec{x} =( \dotprod{\vec{x}}{\vec{a}})\cdot \vec{a}\]
- The image space of #P# is equal to #\ell#. This is geometrically obvious (each point is projected onto #\ell#), and can also be derived algebraically from the mapping rule: the rule shows that each vector in the image is a scalar multiple of #\vec{a}#. Because the image is a linear subspace containing #P(\vec{a}) = \vec{a}#, the image space must coincide with #\linspan{\vec{a}} = \ell#.
- The null space is the line #\ell^{\perp}# that is perpendicular to #\ell# and passes through the origin. Indeed, the null space consists of all vectors # \vec{x}# for which #(\dotprod{\vec{x}}{\vec{a}})\cdot \vec{a} =\vec{0}#, that is, #\dotprod{\vec{x}}{\vec{a}} =0#, which says precisely that #\vec{x}# is perpendicular to #\vec{a}#.
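Both observations can be checked numerically; in the sketch below the unit vector #\vec{a}# is chosen arbitrarily for illustration.

```python
import numpy as np

# An arbitrarily chosen direction vector of ell, normalized to length 1.
a = np.array([3.0, 4.0]) / 5.0

def P(x):
    """Orthogonal projection onto ell = span(a): P(x) = (x . a) a."""
    return np.dot(x, a) * a

# Image: every P(x) is a multiple of a, and P(a) = a, so im(P) = span(a) = ell.
print(P(np.array([2.0, -1.0])))         # [0.24 0.32], a multiple of a
print(np.allclose(P(a), a))             # True

# Kernel: vectors perpendicular to a are mapped to the zero vector.
x_perp = np.array([-4.0, 3.0]) / 5.0    # orthogonal to a
print(np.allclose(P(x_perp), 0.0))      # True
```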
We discuss the link between systems of linear equations and affine subspaces again, this time from the point of view of linear mappings.
Let # L :V \rightarrow W# be a linear mapping and consider the vector equation # L (\vec{x})=\vec{b}#.
The set of solutions of the equation is equal to the full inverse image #L^{-1}(\vec{b}) # of #\vec{b}# under #L#. In particular:
- If #\vec{b} \not\in \im{L}#, then #L^{-1}(\vec{b}) # is empty and the equation has no solution.
- If #\vec{b} \in \im{L}#, then there is a vector #\vec{p}# such that # L( \vec{p})=\vec{b}#. The vector #\vec{p}# is a particular solution of the vector equation # L (\vec{x})=\vec{b}#. All solutions (that is, the general solution) of this vector equation form the affine subspace
\[
\vec{p}+\ker{L} = \left\{\vec{p}+\vec{n}\mid\vec{n}\in{ \ker{L}}\right\}
\]with support vector #\vec{p}# and direction space #\ker{L}#.
In general, the particular solution is not unique: each solution can act as particular solution. Likewise, each vector in the affine subspace #\vec{p}+\ker{L} # can act as support vector.
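For a matrix equation #A\vec{x}=\vec{b}#, this structure "particular solution plus kernel" can be made explicit with sympy; the matrix, right-hand side, and particular solution below are chosen only as an illustration.

```python
import sympy as sp

# Illustration: a system A x = b that has solutions.
A = sp.Matrix([[1, -1, 2],
               [1, -1, 2]])
b = sp.Matrix([1, 1])
x1, x2, x3 = sp.symbols('x1 x2 x3')

# The general solution, parametrized by the free variables x2 and x3.
print(sp.linsolve((A, b), [x1, x2, x3]))
# expected: {(x2 - 2*x3 + 1, x2, x3)}

# A particular solution p and a basis of the kernel: together they describe
# the affine subspace p + ker(L_A) of all solutions.
p = sp.Matrix([1, 0, 0])
print(A * p == b)        # True
print(A.nullspace())
# expected: [Matrix([[1], [1], [0]]), Matrix([[-2], [0], [1]])]
```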
The theorem General and particular solution states that we can find all solutions of the vector equation # L(\vec{x})=\vec{b}# by adding a particular solution to the solutions of the corresponding homogeneous equation. The corresponding homogeneous equation is (by definition) the equation # L (\vec{x})=\vec{0}#. The set of all solutions of this equation is the null space #\ker{ L} #. In particular, the equation # L( \vec{x})=\vec{b}# has at most one solution if #\ker{L}=\{\vec{0}\}#.
The first part of the statement is trivial. For the proof of the second part, we let #\vec{p}# be a particular solution. For each #\vec{n}# in #\ker{L}#, we have # L (\vec{p}+\vec{n})= L (\vec{p})+ L( \vec{n})=\vec{b}+\vec{0}=\vec{b}#. Therefore, #\vec{p}+\vec{n}# is also a solution. Conversely, if #\vec{q}# is a solution, then # L (\vec{q}-\vec{p})= L( \vec{q})- L( \vec{p})=\vec{b}-\vec{b}=\vec{0}#, so #\vec{q}-\vec{p}\in \ker{L}#. Since #\vec{q}=\vec{p}+(\vec{q}-\vec{p})#, the vector #\vec{q}# is indeed the sum of #\vec{p}# and a vector from the kernel. We conclude that the set of all solutions is the affine subspace #\vec{p}+\ker{L}#.
This property is often used in solving linear differential equations.