We saw that a system of linear equations can be written as a matrix equation $A\vec{x}=\vec{b}$. If the rank of the coefficient matrix $A$ is equal to the rank of the augmented matrix $(A\mid\vec{b})$, then the system has a solution.
The only alternative is that the rank of the coefficient matrix is smaller than the rank of the augmented matrix. In this case there is no solution. We now examine how we can find the best approximation to a solution. In other words, we seek a vector $\vec{x}$ for which the distance $\|A\vec{x}-\vec{b}\|$ between $\vec{b}$ and the image $A\vec{x}$ of the approximate solution under $A$ is minimal.
Let $A$ be an $(m\times n)$-matrix and $\vec{b}$ a vector of $\mathbb{R}^m$.
- The equation $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$ with unknown $\vec{x}$ has a solution.
- Each solution $\vec{s}$ of this equation gives a best estimate for the equation $A\vec{x}=\vec{b}$ in the sense that $\|A\vec{x}-\vec{b}\|$ is minimal for $\vec{x}=\vec{s}$ among all vectors $\vec{x}$ of $\mathbb{R}^n$.
If the $(m\times n)$-matrix $A$ has rank $n$, we can easily find the best approximation. After all, the matrix $A^{\top}A$ is then invertible, and we can follow this procedure:
- Calculate $A^{\top}A$ and $A^{\top}\vec{b}$.
- Compute the inverse $(A^{\top}A)^{-1}$ of $A^{\top}A$.
- Now $\vec{s}$ is given by $\vec{s}=(A^{\top}A)^{-1}A^{\top}\vec{b}$.
- Now the best approximation $A\vec{s}$ of $\vec{b}$ within the image of $A$ is given by $A\vec{s}=A\,(A^{\top}A)^{-1}A^{\top}\vec{b}$.
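The three steps above can be sketched as a short program. The matrix, right-hand side, and helper names below are illustrative choices, not taken from the text; the sketch assumes a $(3\times 2)$-matrix of rank $2$ so that $A^{\top}A$ is an invertible $(2\times 2)$-matrix.

```python
# Sketch of the least squares procedure s = (A^T A)^{-1} A^T b for a
# full-rank matrix A, using plain Python lists. Helper names are our own.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(M, N):
    # Entry (i, j) is the dot product of row i of M with column j of N.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)]
            for row in M]

def solve2(M, v):
    # Invert a 2x2 matrix M and apply it to v (valid only when det != 0).
    (a, b), (c, d) = M
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [inv[0][0] * v[0] + inv[0][1] * v[1],
            inv[1][0] * v[0] + inv[1][1] * v[1]]

A = [[1, 0], [0, 1], [1, 1]]   # example 3x2 matrix of rank 2 (assumed data)
b = [1, 1, 1]

At = transpose(A)
AtA = matmul(At, A)                                     # step 1: A^T A
Atb = [sum(r[i] * b[i] for i in range(3)) for r in At]  # step 1: A^T b
s = solve2(AtA, Atb)                                    # steps 2-3
print(s)   # the least squares solution, here (2/3, 2/3)
```

For this choice of $A$ and $\vec{b}$ the normal equations read $2x+y=2$ and $x+2y=2$, so the printed solution is $(\tfrac23,\tfrac23)$.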
The fact that the condition on the rank is essential becomes clear when we look at the following example. For the rank $1$ matrix
\[A=\begin{pmatrix}1&1\\1&1\\1&1\end{pmatrix}\]
the matrix $A^{\top}A$ is given by
\[A^{\top}A=\begin{pmatrix}3&3\\3&3\end{pmatrix}.\]
This matrix is clearly not invertible, since its determinant is equal to $3\cdot 3-3\cdot 3=0$.
This statement is especially useful when the system $A\vec{x}=\vec{b}$ does not have a solution. If this system has a solution, that solution is also a solution of $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$, so the least squares procedure is then unnecessarily cumbersome.
Consider the inconsistent system
\[\left\{\begin{aligned}x+y&=2\\x-y&=3\\y&=1\end{aligned}\right.\]
It is clear that this system does not have a solution. The last equation gives $y=1$, which, in combination with the first, leads to $x=1$. If we substitute this into the second equation, we get $1-1=3$, a contradiction. Since we cannot find an exact solution, we will look for a best approximation. To this end, we first write the system as the matrix equation $A\vec{x}=\vec{b}$ with
\[A=\begin{pmatrix}1&1\\1&-1\\0&1\end{pmatrix},\qquad \vec{b}=\begin{pmatrix}2\\3\\1\end{pmatrix}.\]
We calculate
\[A^{\top}A=\begin{pmatrix}2&0\\0&3\end{pmatrix}\quad\text{and}\quad A^{\top}\vec{b}=\begin{pmatrix}5\\0\end{pmatrix}.\]
The inverse of the matrix $A^{\top}A$ is given by
\[\left(A^{\top}A\right)^{-1}=\begin{pmatrix}\tfrac12&0\\0&\tfrac13\end{pmatrix}.\]
Now we can calculate the vector $\vec{s}$:
\[\vec{s}=\left(A^{\top}A\right)^{-1}A^{\top}\vec{b}=\begin{pmatrix}\tfrac12&0\\0&\tfrac13\end{pmatrix}\begin{pmatrix}5\\0\end{pmatrix}=\begin{pmatrix}\tfrac52\\0\end{pmatrix}.\]
The matrix $A$ maps this vector onto the vector $A\vec{s}$, which is given by
\[A\vec{s}=\begin{pmatrix}1&1\\1&-1\\0&1\end{pmatrix}\begin{pmatrix}\tfrac52\\0\end{pmatrix}=\begin{pmatrix}\tfrac52\\\tfrac52\\0\end{pmatrix}.\]
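A least squares solution can be checked numerically without redoing the arithmetic: the residual $A\vec{s}-\vec{b}$ must be orthogonal to every column of $A$. The following sketch assumes the inconsistent system $x+y=2$, $x-y=3$, $y=1$ with least squares solution $\vec{s}=(\tfrac52,0)$; these concrete numbers are our assumed example.

```python
# Orthogonality check for an assumed least squares example:
# A s - b must be perpendicular to both columns of A.

A = [[1, 1], [1, -1], [0, 1]]   # assumed 3x2 coefficient matrix
b = [2, 3, 1]                   # assumed right-hand side
s = [2.5, 0.0]                  # solution of A^T A s = A^T b for this data

As = [sum(A[i][j] * s[j] for j in range(2)) for i in range(3)]
residual = [As[i] - b[i] for i in range(3)]

# Dot product of the residual with each column of A.
for j in range(2):
    dot = sum(A[i][j] * residual[i] for i in range(3))
    print(dot)   # prints 0.0 for each column
```

That both dot products vanish confirms that $A\vec{s}$ is the orthogonal projection of $\vec{b}$ onto the image of $A$.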
A classic application of the least squares method has a statistical character. Let $(x_i,y_i)$ for $i=1,\ldots,n$ be points of the $x,y$-plane. The problem of linear regression asks for a line that best describes this collection of points. To this end, we write $y=a\,x+b$ for an equation of the line, where $a$ and $b$ are yet to be determined real numbers. A best approximate solution of the following system in the unknowns $a$ and $b$ is required:
\[\left\{\begin{aligned}a\,x_1+b&=y_1\\&\;\;\vdots\\a\,x_n+b&=y_n\end{aligned}\right.\]
In matrix form:
\[\begin{pmatrix}x_1&1\\\vdots&\vdots\\x_n&1\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix}=\begin{pmatrix}y_1\\\vdots\\y_n\end{pmatrix}\]
The least squares method will provide a best approximate solution for $a$ and $b$.
The same method as for linear regression works for finding the best approximation of a set of points $(x_i,y_i)$ for $i=1,\ldots,n$ in the $x,y$-plane by the graph of a polynomial function of higher degree. If we set $y=a_kx^k+\cdots+a_1x+a_0$, the corresponding matrix equation becomes
\[\begin{pmatrix}x_1^k&\cdots&x_1&1\\\vdots&&\vdots&\vdots\\x_n^k&\cdots&x_n&1\end{pmatrix}\begin{pmatrix}a_k\\\vdots\\a_1\\a_0\end{pmatrix}=\begin{pmatrix}y_1\\\vdots\\y_n\end{pmatrix}\]
In fact, this method is not limited to polynomials. For instance, exponential functions are often used in the modelling of growth of populations.
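The regression systems above can be fed straight into the normal equation. The following sketch (helper names and sample data are our own, and the coefficients come out in ascending order $a_0,\ldots,a_k$ rather than the descending order used above) fits a degree-$k$ polynomial by solving $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$ with Gaussian elimination.

```python
# Polynomial least squares fit via the normal equations; illustrative code.

def solve(M, v):
    # Gauss-Jordan elimination with partial pivoting; M is n x n, invertible.
    n = len(M)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * d for a, d in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_poly(xs, ys, k):
    # Design matrix with rows (1, x, x^2, ..., x^k), then solve A^T A c = A^T y.
    A = [[x ** j for j in range(k + 1)] for x in xs]
    At = list(map(list, zip(*A)))
    AtA = [[sum(a * b for a, b in zip(r, c)) for c in At] for r in At]
    Aty = [sum(a * y for a, y in zip(r, ys)) for r in At]
    return solve(AtA, Aty)

# Linear regression (k = 1) through the sample points (0,1), (1,3), (2,4):
print(fit_poly([0, 1, 2], [1, 3, 4], 1))   # [a_0, a_1] = [7/6, 3/2]
```

Raising `k` fits a higher-degree polynomial with the same code; only the design matrix changes.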
The equation $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$ is also called the normal equation of the matrix equation $A\vec{x}=\vec{b}$.
We will prove both parts, starting with the first.
1. Let $A$ be an $(m\times n)$-matrix and $\vec{b}$ a vector from $\mathbb{R}^m$. We claim $\ker\left(A^{\top}A\right)=\ker(A)$. Clearly $\ker(A)\subseteq\ker\left(A^{\top}A\right)$, so to prove the assertion we only have to verify that $\ker\left(A^{\top}A\right)\subseteq\ker(A)$. Suppose for that purpose that $\vec{x}$ belongs to $\ker\left(A^{\top}A\right)$. In that case
\[\|A\vec{x}\|^2=(A\vec{x})\cdot(A\vec{x})=\vec{x}\cdot\left(A^{\top}A\,\vec{x}\right)=\vec{x}\cdot\vec{0}=0.\]
Since the inner product is positive-definite, it follows that $A\vec{x}=\vec{0}$, that is to say: $\vec{x}$ belongs to $\ker(A)$. This proves the assertion $\ker\left(A^{\top}A\right)=\ker(A)$.
We will now prove $\operatorname{im}\left(A^{\top}A\right)=\operatorname{im}\left(A^{\top}\right)$. Because $\operatorname{im}\left(A^{\top}A\right)\subseteq\operatorname{im}\left(A^{\top}\right)$ is clear, it suffices for the proof to establish the inclusion $\operatorname{im}\left(A^{\top}\right)\subseteq\operatorname{im}\left(A^{\top}A\right)$. This follows from the equality of kernels just proved: $\ker\left(A^{\top}A\right)=\ker(A)$, so that also $\ker\left(A^{\top}A\right)^{\perp}=\ker(A)^{\perp}$. According to properties of the perpendicular space (the image of the transpose of a matrix is the orthogonal complement of its kernel) we then also have
\[\operatorname{im}\left(A^{\top}A\right)=\ker\left(\left(A^{\top}A\right)^{\top}\right)^{\perp}=\ker\left(A^{\top}A\right)^{\perp}=\ker(A)^{\perp}=\operatorname{im}\left(A^{\top}\right).\]
The vector $A^{\top}\vec{b}$ belongs to $\operatorname{im}\left(A^{\top}\right)=\operatorname{im}\left(A^{\top}A\right)$. Thus, there is a vector $\vec{s}$ such that $A^{\top}A\,\vec{s}=A^{\top}\vec{b}$. This proves the first statement.
2. Suppose that $\vec{s}$ is a solution of the equation $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$. We claim that $A\vec{s}$ is then the orthogonal projection of $\vec{b}$ onto $\operatorname{im}(A)$. To see this, we compute, for an arbitrary vector $\vec{x}$ of $\mathbb{R}^n$,
\[(A\vec{x})\cdot(A\vec{s}-\vec{b})=\vec{x}\cdot\left(A^{\top}A\,\vec{s}-A^{\top}\vec{b}\right)=\vec{x}\cdot\vec{0}=0.\]
This shows that $A\vec{s}-\vec{b}$ is perpendicular to the image of $A$. The vector $A\vec{s}$ belongs to $\operatorname{im}(A)$ and thus is the orthogonal projection of $\vec{b}$ onto $\operatorname{im}(A)$. This means that, among all images of vectors under $A$, the vector $A\vec{s}$ is nearest to $\vec{b}$. Therefore, the solution $\vec{s}$ gives the best approximation to a solution of the equation $A\vec{x}=\vec{b}$.
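The minimality claim in the second part can be illustrated with a small numerical spot-check. The matrix, right-hand side, and solution below are illustrative assumptions, not data from the text: among a grid of vectors near the least squares solution, none gives a smaller residual norm.

```python
# Spot-check of minimality: ||A x - b|| is smallest at the least squares
# solution s among sampled vectors x near s (assumed example data).

import itertools
import math

A = [[1, 0], [0, 1], [1, 1]]   # assumed 3x2 matrix of rank 2
b = [1.0, 1.0, 1.0]            # assumed right-hand side
s = [2/3, 2/3]                 # solves A^T A s = A^T b for this A, b

def resid_norm(x):
    # Euclidean norm of A x - b.
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(3)]
    return math.sqrt(sum((Ax[i] - b[i]) ** 2 for i in range(3)))

best = resid_norm(s)
for dx, dy in itertools.product([-0.1, 0.0, 0.1], repeat=2):
    # No nearby sample beats the least squares solution.
    assert resid_norm([s[0] + dx, s[1] + dy]) >= best - 1e-12
print(round(best, 4))   # prints 0.5774, i.e. sqrt(1/3)
```

Sampling is of course no proof; the proof above shows minimality over all of $\mathbb{R}^n$ at once.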
The method is called the least squares method because of the application to linear regression in the plane, where the problem is to find, for a given set of points, a line that best describes the set as part of the graph of a linear function. The solution is the line that minimizes the sum, over this set of points, of the squares of the vertical distances of the points to the line.
Find the least squares solution to the problem:
\[\left\{\begin{aligned}x+y&=1\\x+2y&=2\\x+3y&=2\end{aligned}\right.\]
To find the least squares solution to this problem, we need to solve $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$, where
\[A=\begin{pmatrix}1&1\\1&2\\1&3\end{pmatrix},\qquad \vec{b}=\begin{pmatrix}1\\2\\2\end{pmatrix}.\]
We calculate $A^{\top}A$ and $A^{\top}\vec{b}$:
\[A^{\top}A=\begin{pmatrix}3&6\\6&14\end{pmatrix}\]
and
\[A^{\top}\vec{b}=\begin{pmatrix}5\\11\end{pmatrix}.\]
Therefore, the augmented matrix corresponding to $A^{\top}A\,\vec{x}=A^{\top}\vec{b}$ is
\[\left(\begin{array}{cc|c}3&6&5\\6&14&11\end{array}\right).\]
Solving this with row reduction, we find
\[\left(\begin{array}{cc|c}1&0&\tfrac23\\0&1&\tfrac12\end{array}\right).\]
Hence,
\[\vec{s}=\begin{pmatrix}\tfrac23\\\tfrac12\end{pmatrix}.\]
Because the matrix $A^{\top}A$ is invertible, we could also have calculated the least squares solution by inverting $A^{\top}A$ and performing the calculations as presented in the theory.
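The remark above can be checked in code: row reduction of the augmented normal system and multiplication by the inverse of $A^{\top}A$ give the same least squares solution. The data below is an assumed $(3\times 2)$ example; the two hardcoded normal-equation quantities are computed from it.

```python
# Two routes to the least squares solution of an assumed example system.

A = [[1, 1], [1, 2], [1, 3]]   # assumed coefficient matrix
b = [1, 2, 2]                  # assumed right-hand side
AtA = [[3, 6], [6, 14]]        # A^T A for this data
Atb = [5, 11]                  # A^T b for this data

# Route 1: row reduction of the augmented matrix [A^T A | A^T b].
aug = [AtA[0] + [Atb[0]], AtA[1] + [Atb[1]]]
f = aug[1][0] / aug[0][0]
aug[1] = [a - f * c for a, c in zip(aug[1], aug[0])]   # eliminate below pivot
y = aug[1][2] / aug[1][1]                              # back-substitute
x = (aug[0][2] - aug[0][1] * y) / aug[0][0]

# Route 2: s = (A^T A)^{-1} A^T b via the 2x2 inverse.
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]    # nonzero, so invertible
inv = [[AtA[1][1] / det, -AtA[0][1] / det],
       [-AtA[1][0] / det, AtA[0][0] / det]]
s = [inv[0][0] * Atb[0] + inv[0][1] * Atb[1],
     inv[1][0] * Atb[0] + inv[1][1] * Atb[1]]

print([x, y], s)   # both routes give (2/3, 1/2)
```

Row reduction is usually preferred in hand computation, since it avoids writing out the inverse; for an invertible $A^{\top}A$ the two routes always agree.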