Symmetric maps can be used to define quadratic forms. A quadratic form on a vector space #V# is a second-degree homogeneous polynomial in the coordinates of a vector with respect to a fixed basis of #V#. We begin with a more intrinsic definition for the case of a real vector space. To this end we use the term polarization for the right-hand side of the polarization formula of an inner product.
A quadratic form on a real vector space #V# is a function #q:V\to\mathbb{R}# with the following two properties:
- homogeneity: For each scalar #\lambda# and each vector #\vec{x}# of #V# we have \(q(\lambda\cdot \vec{x}) = \lambda^2\cdot q(\vec{x})\).
- bilinearity of polarization: The real-valued map #f_q# on pairs of vectors from #V# defined by \( f_q(\vec{x},\vec{y}) =\frac12\left( q(\vec{x}+\vec{y}) - q(\vec{x})-q(\vec{y})\right)\) is bilinear.
The bilinear map #f_q# is symmetric and satisfies #q(\vec{x}) = f_q(\vec{x},\vec{x})#, so #q# is uniquely determined by #f_q#; the map #f_q# is called the bilinear form of #q#.
If #g# is a symmetric bilinear form on #V#, then #r(\vec{x}) =g(\vec{x},\vec{x})# is a quadratic form. Each quadratic form can be obtained in this way. Moreover, #g# is the bilinear form of #r#. We call #r# the quadratic form defined by #g#.
Suppose that #q# is a quadratic form on a vector space #V#. Then the associated bilinear form #f_q# satisfies
\[\begin{array}{rcl}f_q( \vec{x} ,\vec{x} )&=&\frac12\left(q( 2\vec{x}) -2q(\vec{x} )\right)\\&&\phantom{xx}\color{blue}{\text{definition }f_q( \vec{x} ,\vec{y} )\text{ with }\vec{y}=\vec{x}}\\&=&\frac12\left(4q(\vec{x}) - 2q(\vec{x} )\right)\\&&\phantom{xx}\color{blue}{q(\lambda\vec{x}) = \lambda^2q(\vec{x})}\\&=& q(\vec{x} )\\&&\phantom{xx}\color{blue}{\text{simplified}}\end{array}\]
This shows that #q(\vec{x}) = f_q(\vec{x},\vec{x})# is uniquely determined by #f_q#.
Now let #g# be a symmetric bilinear form on #V#. Then #r(\vec{x}) =g(\vec{x},\vec{x})# is homogeneous because of the bilinearity of #g#:
\[r(\lambda\vec{x})= g(\lambda\vec{x},\lambda\vec{x})=\lambda^2 g(\vec{x},\vec{x})=\lambda^2 r(\vec{x})\]
The polarization of #r# is bilinear because of
\[\begin{array}{rcl}f_r(\vec{x},\vec{y}) &=&\frac12\left( r(\vec{x}+\vec{y}) - r(\vec{x})-r(\vec{y})\right)\\&&\phantom{xxx}\color{blue}{\text{definition }f_r}\\ &=&\frac12\left( g(\vec{x}+\vec{y},\vec{x}+\vec{y}) - g(\vec{x},\vec{x})-g(\vec{y},\vec{y})\right)\\&&\phantom{xxx}\color{blue}{\text{definition }r}\\&=&\frac12\left(g(\vec{x},\vec{x})+g(\vec{x},\vec{y})+g(\vec{y},\vec{x})+g(\vec{y},\vec{y})-g(\vec{x},\vec{x})-g(\vec{y},\vec{y})\right)\\&&\phantom{xxx}\color{blue}{\text{bilinearity }g}\\&=&\frac12\left(g(\vec{x},\vec{y})+g(\vec{y},\vec{x})\right)\\&&\phantom{xxx}\color{blue}{\text{simplified}}\\&=&g(\vec{x},\vec{y})\\&& \phantom{xxx}\color{blue}{\text{symmetry }g}\end{array}\] This shows that #r# is a quadratic form, and that the corresponding bilinear form is equal to #g#.
According to the first statement, every quadratic form #q# with bilinear form #f_q# can be obtained as the quadratic form determined by #f_q#.
For each quadratic form #q# we have #q(\vec{0}) = 0#. This follows from the homogeneity of #q# with #\lambda=0#.
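The two defining properties are easy to test numerically. The following sketch (NumPy, with helper names of our own choosing) computes the polarization #f_q# of a given function #q# and spot-checks homogeneity, symmetry, and the identity #f_q(\vec{x},\vec{x})=q(\vec{x})# on random vectors.

```python
import numpy as np

def polarization(q):
    """Return f_q(x, y) = (q(x + y) - q(x) - q(y)) / 2, the polarization of q."""
    return lambda x, y: 0.5 * (q(x + y) - q(x) - q(y))

# Example: q(x) = x . (A x) for a symmetric matrix A, which is a quadratic form.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
A = (A + A.T) / 2                                 # symmetrize

def q(x):
    return x @ A @ x

f = polarization(q)
x, y, lam = rng.normal(size=3), rng.normal(size=3), 1.7

print(np.isclose(q(lam * x), lam**2 * q(x)))      # homogeneity
print(np.isclose(f(x, y), f(y, x)))               # the polarization is symmetric
print(np.isclose(f(x, y), x @ A @ y))             # ... and recovers the bilinear form
print(np.isclose(f(x, x), q(x)))                  # q(x) = f_q(x, x)
```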
Let #V# be the inner product space #\mathbb{R}#. Each bilinear form on #V# is of the form \[g(x,y) = a\cdot x\cdot y\] for a constant real number #a#. In order to see this, put #a = g(1,1)#. Then, by the bilinearity of the form, we have \(g(x,y) = x\cdot g(1,1)\cdot y = a\cdot x\cdot y\). In particular, each bilinear form on a #1#-dimensional vector space is symmetric.
Let #q# be a quadratic form on #V#. Then there is a constant #a# such that # q(x)= a \cdot x^2# for every real number #x#; indeed, homogeneity gives #q(x) = q(x\cdot 1) = x^2\cdot q(1)#, so we can take #a = q(1)#. The corresponding bilinear form \[f_q(x,y) =\frac12\left(a\cdot (x+y)^2 -a\cdot x^2-a\cdot y^2 \right)= a\cdot x\cdot y\] is positive-definite if and only if #a\gt0#. Hence, there are quadratic forms whose bilinear forms are not inner products.
The function #q# on #\mathbb{R}^2# determined by \(q(\rv{x,y}) = 2x^2-4xy+5y^2\) is a quadratic form and the corresponding bilinear form is
\[\begin{array}{rcl}f(\rv{x,y},\rv{u,v}) &=& \frac12\left( q(\rv{x,y}+\rv{u,v}) - q(\rv{x,y})-q(\rv{u,v})\right)\\ &=&(x+u)^2-2(x+u)\cdot(y+v)+\frac52(y+v)^2\\&&\phantom{XXX}-(x^2-2xy+\frac52y^2)\\&&\phantom{XXX}-(u^2-2uv+\frac52v^2)\\&=&2xu-2xv-2uy+5yv\end{array}\]
Indeed, #f(\rv{x,y},\rv{x,y})=2x^2-2xy-2xy+5y^2# coincides with #q(\rv{x,y})#.
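For this particular #q#, the polarization can be double-checked numerically; a minimal sketch (variable names are ours):

```python
import numpy as np

def q(p):
    x, y = p
    return 2 * x**2 - 4 * x * y + 5 * y**2

def f(p, r):
    # polarization of q: (q(p + r) - q(p) - q(r)) / 2
    return 0.5 * (q(np.add(p, r)) - q(p) - q(r))

x, y = 1.3, -0.7                                   # first argument (x, y)
u, v = 0.4, 2.1                                    # second argument (u, v)
print(np.isclose(f((x, y), (u, v)), 2*x*u - 2*x*v - 2*u*y + 5*y*v))  # closed formula
print(np.isclose(f((x, y), (x, y)), q((x, y))))                      # f(p, p) = q(p)
```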
We will show that homogeneity cannot be omitted from the definition of a quadratic form. We have seen that, for any quadratic form #q# on #\mathbb{R}#, there is a constant #a# such that # q(x)= a \cdot x^2#. In particular, #r(x) = x^2 + x# is not a quadratic form. Yet #r# satisfies the second condition (bilinearity of polarization) for a quadratic form, since \[f_r(x,y) = \frac12\left(r(x+y)-r(x)-r(y)\right) = x\cdot y\] is a symmetric bilinear function in #x# and #y#. This shows that omitting the condition of homogeneity would substantially change the definition of a quadratic form.
If #q# is a quadratic form on a vector space #V# with bilinear form #f_q#, then #f_q# is symmetric. But #f_q# is not necessarily positive-definite. The form #f_q# is positive-definite (and therefore also an inner product on #V#) if and only if #q(\vec{x})\ge 0# and the equality #q(\vec{x}) = 0# holds only for #\vec{x} = \vec{0}#.
If #V = \mathbb{R}^n#, for a vector #\vec{x} = \rv{x_1,\ldots,x_n}# we also write #q(x_1,\ldots,x_n)# instead of #q(\vec{x})#. For example, # q(x,y,z) = q(\rv{x,y,z}) # for #\rv{x,y,z}\in\mathbb{R}^3#.
The definition for complex vector spaces is the same with the understanding that the map #q# has range #\mathbb{C}#.
We will now show how the homogeneous polynomials of degree #2# appear after a basis has been fixed. Recall from Coordinatization that, if #\alpha# is a basis of #V#, the map #\alpha:V\to\mathbb{R}^n#, where #n=\dim{V}#, assigns to each vector of #V# its coordinate vector with respect to #\alpha#.
Let #V# be a vector space of finite dimension #n# with basis #\alpha# and #q# a quadratic form on #V#.
- If #f# is the bilinear form of #q#, then there is a unique symmetric matrix #A# such that, for all #\vec{u},\vec{v}\in V#, we have \[ f(\vec{u},\vec{v}) =\dotprod{\alpha( \vec{u}) }{( A\,\alpha( \vec{v}))}\] We call #A# the matrix of #q# with respect to #\alpha#. In particular, #q\circ\alpha^{-1}# is of the form \[q(\alpha^{-1}(\vec{x})) =\dotprod{ \vec{x} }{( A\,\vec{x})} = \sum_{i,j=1}^n a_{ij}x_i x_j\phantom{xx}\text{for }\vec{x} = \rv{x_1,\ldots,x_n}\in\mathbb{R}^n\] where #a_{ij}# is the #(i,j)#-entry of #A#.
- If #\beta# is another basis of #V#, then the matrix #B# of #q# with respect to #\beta# is given by \[ B={}_\alpha I_{\beta}^\top\, A\,\; {}_\alpha I_{\beta}\]
- There exists a basis #\beta# for #V# such that the transformation matrix #{}_\alpha I_\beta# is orthogonal and the matrix #B# of #q# with respect to #\beta# is diagonal. In particular, #q\circ\beta^{-1}# is of the form \[q(\beta^{-1}(\vec{x})) = \sum_{i=1}^n b_ix_i^2\phantom{xx}\text{for }\rv{x_1,\ldots,x_n}\in\mathbb{R}^n\] where #b_i# are the eigenvalues of #A#. We call such a form a diagonal form of #q#.
- The bilinear form #f# of #q# is an inner product on #V# if and only if all of the eigenvalues of #A# are positive.
The formula for \(q(\alpha^{-1}(\vec{x}))\) shows that the quadratic form is a polynomial in the coordinates of #\vec{x}\in\mathbb{R}^n#.
The formula for \(q(\beta^{-1}(\vec{x}))\) reveals that, relative to a suitable basis, the polynomial may be written as a sum of squared terms, that is to say, of the form #b_i\cdot x_i^2#.
By replacing #\vec{x}# by #\beta(\vec{x})# in the formula \(q(\beta^{-1}(\vec{x})) = \sum_{i=1}^n b_ix_i^2\), we get
\[q(\vec{x}) = \sum_{i=1}^n b_i\cdot (\beta (\vec{x})_i)^2\]
Thus, #q(\vec{x})# itself can also be written as a linear combination of squares.
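Statements 1 and 2 can be illustrated by a short computation. The sketch below (all names are ours, NumPy assumed) reads off the matrix of the form #q(\rv{x,y}) = 2x^2-4xy+5y^2# from the example above with respect to the standard basis via polarization, and then verifies the change-of-basis rule #B={}_\alpha I_\beta^\top\, A\,\;{}_\alpha I_\beta# for some invertible transition matrix.

```python
import numpy as np

def q(p):
    x, y = p
    return 2 * x**2 - 4 * x * y + 5 * y**2

def f(p, r):                                      # polarization of q
    return 0.5 * (q(p + r) - q(p) - q(r))

e = np.eye(2)
A = np.array([[f(e[i], e[j]) for j in range(2)] for i in range(2)])
print(A)                                          # [[ 2. -2.] [-2.  5.]]

T = np.array([[1.0, 2.0], [0.0, 1.0]])            # transition matrix of some other basis
B = T.T @ A @ T                                   # matrix of q with respect to that basis
c = np.array([0.3, -1.2])                         # coordinates with respect to the new basis
print(np.isclose(c @ B @ c, q(T @ c)))            # the same value in either coordinate system
```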
We prove each of the statements individually.
1. Let #\vec{e}_1,\ldots,\vec{e}_n# be the standard basis of #\mathbb{R}^n# and put #a_{ij}= f(\alpha^{-1}(\vec{e}_i),\alpha^{-1}(\vec{e}_j))#. Further, let #A# be the #(n\times n)#-matrix with #(i,j)#-entry #a_{ij}#. Since #f# is symmetric, we have #a_{ij}=a_{ji}#, so #A# is a symmetric matrix. Then, for #\vec{x} = \rv{x_1,\ldots,x_n}# in #\mathbb{R}^n#, we have
\[\begin{array}{rcl}q(\alpha^{-1}(\vec{x})) &=& f( \alpha^{-1}(\vec{x}),\alpha^{-1}(\vec{x}))\\ &&\phantom{xxx}\color{blue}{f \text{ is the bilinear form of }q}\\ &=&\displaystyle f\left( \alpha^{-1}\left(\sum_{i=1}^nx_i\vec{e}_i\right),\alpha^{-1}\left(\sum_{j=1}^nx_j\vec{e}_j\right)\right)\\ &&\phantom{xxx}\color{blue}{\text{definition }\vec{x}}\\ &=&\displaystyle \sum_{i,j=1}^nx_ix_j\cdot f\left( \alpha^{-1}(\vec{e}_i),\alpha^{-1}(\vec{e}_j)\right)\\ &&\phantom{xxx}\color{blue}{\text{bilinearity }f\text{ and linearity }\alpha^{-1}}\\ &=&\displaystyle \sum_{i,j=1}^nx_ix_j\cdot a_{ij}\\ &&\phantom{xxx}\color{blue}{\text{definition }a_{ij}}\\&=&\displaystyle \sum_{i,j=1}^nx_ix_j\cdot\dotprod{\vec{e}_i}{(A\,\vec{e}_j)}\\ &&\phantom{xxx}\color{blue}{\text{definition }A}\\&=&\displaystyle \dotprod{ \vec{x}}{(A\,\vec{x})}\\ &&\phantom{xxx}\color{blue}{\text{bilinearity inner product and linearity }A}\\\end{array}\]
With this, the expressions in the theorem for #q(\alpha^{-1}(\vec{x}))# have been derived. Next, since both #f( \alpha^{-1}(\vec{x}),\alpha^{-1}(\vec{y}))# and #\dotprod{ \vec{x}}{(A\,\vec{y})}# are symmetric bilinear forms in #\vec{x}# and #\vec{y}# with the same associated quadratic form, polarization shows that, for vectors #\vec{x}# and #\vec{y}# of #\mathbb{R}^n#, \[f( \alpha^{-1}(\vec{x}),\alpha^{-1}(\vec{y})) = \dotprod{ \vec{x}}{(A\,\vec{y})}\] Substituting #\vec{x} =\alpha(\vec{u})# and #\vec{y} =\alpha(\vec{v})#, we find \( f(\vec{u},\vec{v}) =\dotprod{\alpha( \vec{u}) }{( A\,\alpha( \vec{v}))}\). Finally, #A# is unique: this relation forces its #(i,j)#-entry to equal #\dotprod{\vec{e}_i}{(A\,\vec{e}_j)} = f(\alpha^{-1}(\vec{e}_i),\alpha^{-1}(\vec{e}_j))#.
2. The matrices #A# and #B# are both determined by values of #f#. This leads to the following relation between #A# and #B#:
\[\begin{array}{rcl} \dotprod{\beta(\vec{x})}{(B\, \beta(\vec{y}))} &=& f(\vec{x},\vec{y})\\ &&\phantom{xx}\color{blue}{\text{definition of }B}\\&=&\dotprod{\alpha(\vec{x})}{(A\, \alpha(\vec{y}))} \\ &&\phantom{xx}\color{blue}{\text{definition of }A}\\ &=&\dotprod{(\alpha\beta^{-1}(\beta(\vec{x})))}{(A\, (\alpha\beta^{-1})(\beta(\vec{y})))}\\&&\phantom{xx}\color{blue}{\text{rewritten}}\\&=&\dotprod{({}_\alpha I_\beta(\beta(\vec{x})))}{(A\, {}_\alpha I_\beta(\beta(\vec{y})))}\\&&\phantom{xx}\color{blue}{\text{notation for transition matrix}}\\&=&\dotprod{\beta(\vec{x})}{({}_\alpha I_\beta^\top\,A\, \;{}_\alpha I_\beta(\beta(\vec{y})))}\\&&\phantom{xx}\color{blue}{\text{defining property of the transpose}}\\\end{array}\] Because #\beta# is surjective, both #\beta(\vec{x})# and #\beta(\vec{y})# run over all vectors of #\mathbb{R}^n#. Consequently, the matrices #B# and #{}_\alpha I_\beta^\top\,A\, \;{}_\alpha I_\beta# are identical. This proves that #B = {}_\alpha I_\beta^\top\,A\, \;{}_\alpha I_\beta#.
3. By the theorem Diagonalizability of symmetric matrices, there is an orthogonal matrix #Q# such that #B = Q^\top \,A Q# is a diagonal matrix. Let \[\beta=\basis{\alpha^{-1}(Q\,\vec{e}_1),\ldots,\alpha^{-1}(Q\,\vec{e}_n)}\]
Then, for all #i#, \[\alpha^{-1}(Q\,\vec{e}_i)=\beta^{-1}(\vec{e}_i)=\alpha^{-1}({}_\alpha I_\beta\,\vec{e}_i)\]
Since #Q# is invertible and #\alpha^{-1}# is an isomorphism, #\beta# is indeed a basis, and the displayed equalities show that #Q={}_\alpha I_\beta#. From statement 2 it follows that #B# is the matrix of #q# with respect to #\beta#. Because #B# is a diagonal matrix, the #(i,j)#-entries #b_{ij}# of #B# satisfy #b_{ij}=0# if #i\ne j#, so
\[q(\beta^{-1}(\vec{x})) = \sum_{i,j=1}^n b_{ij}x_i x_j= \sum_{i=1}^n b_{ii}x_i^2\]
Because #Q# is orthogonal, #B = Q^\top \,A Q =Q^{-1} \,A Q# is conjugate to # A#. In particular, the diagonal entries #b_i = b_{ii}# of #B# are the eigenvalues of #A#. This settles the proof of 3.
4. Let #\vec{v}# be a vector of #V# distinct from the zero vector. The value of #f(\vec{v},\vec{v})# is equal to #q(\vec{v})=\sum_{i=1}^n b_{i}x_i^2# for #\rv{x_1,\ldots,x_n} = \beta(\vec{v})#. This value is positive for all nonzero vectors #\vec{v}# of #V# if and only if all #b_i#, that is, all eigenvalues of #A#, are positive. Since #f# is symmetric and bilinear, this shows that #f# is an inner product on #V# if and only if all eigenvalues of #A# are positive.
By scaling the vectors of the basis #\beta#, we can even achieve that each diagonal entry of the matrix of #q# is equal to one of #0#, #1#, #-1#. To this end, we scale the #i#-th element of #\beta# by #\frac{1}{\sqrt{|b_i|}}# if #b_i\ne0#. The corresponding transformation matrix is no longer orthogonal, so distances in #V# are no longer preserved.
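A minimal sketch of this rescaling step, assuming the diagonal entries #b_i# have already been found:

```python
import numpy as np

b = np.array([1.0, 6.0, 0.0, -2.0])               # diagonal entries b_i of B
# scale the i-th basis vector by 1/sqrt(|b_i|) when b_i is nonzero, leave it otherwise
scale = np.array([1 / np.sqrt(abs(t)) if t != 0 else 1.0 for t in b])
S = np.diag(scale)                                # rescaling matrix (not orthogonal)
print(np.diag(S.T @ np.diag(b) @ S))              # [ 1.  1.  0. -1.]
```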
Consider once more the quadratic form #q# on #\mathbb{R}^2# determined by \(q(\rv{x,y}) = 2x^2-4xy+5y^2\) with corresponding bilinear form
\[\begin{array}{rcl}f(\rv{x,y},\rv{u,v}) &=& 2xu-2xv-2uy+5yv\end{array}\]
The matrix #A# of #f# satisfies
\[\begin{array}{rcl} \matrix{x&y}\, A\, \matrix{u\\ v} &=& f(\rv{x,y},\rv{u,v})\\&=&2xu-2xv-2uy+5yv\end{array}\]
From this we deduce:
\[\begin{array}{rcl} a_{11}&=&\matrix{1&0}\, A\, \matrix{1\\ 0}= 2 \\&&\phantom{xx}\color{blue}{x=1,y=0,u=1,v=0}\\
a_{12}&=&\matrix{1&0}\, A\, \matrix{0\\ 1}= -2 \\ &&\phantom{xx}\color{blue}{x=1,y=0,u=0,v=1}\\
a_{22}&=&\matrix{0&1}\, A\, \matrix{0\\ 1} = 5\\ &&\phantom{xx}\color{blue}{x=0,y=1,u=0,v=1}\\
\end{array}\]
The #(2,1)#-entry #a_{21}# of #A# is equal to #a_{12}=-2# because #A# is symmetric. So
\[ A = \matrix{2 & -2\\ -2&5}\]
We bring #q# into diagonal form by calculating an orthonormal basis #\beta# of eigenvectors of #A#. The eigenvalues are solutions of the characteristic equation:
\[\begin{array}{rcl}p_A(x) &=& x^2-7x+6\\ &&\phantom{xx}\color{blue}{\text{the characteristic polynomial}}\\\rv{\lambda_1,\lambda_2} &=& \rv{1,6}\\ &&\phantom{xx}\color{blue}{\text{solutions of the equation }p_A(x) = 0}\\\beta &=& \basis{\frac{1}{\sqrt{5}}\rv{2,1},\frac{1}{\sqrt{5}}\rv{1,-2}}\\&&\phantom{xx}\color{blue}{\text{corresponding normalized eigenvectors}}\\{}_\varepsilon I_\beta&=&\frac{1}{\sqrt{5} }\matrix{2&1\\1&-2}\\ &&\phantom{xx}\color{blue}{\text{corresponding transformation matrix}}\\ q(\beta^{-1}\rv{x,y}) &=&\matrix{x&y}\,{}_\varepsilon I_\beta^\top\, A\,{}_\varepsilon I_\beta \,\matrix{x\\ y}\\&&\phantom{xx}\color{blue}{\text{quadratic form with respect to }\beta}\\&=&\matrix{x&y}\,\matrix{1&0\\ 0&6}\,\matrix{x\\ y}\\&&\phantom{xx}\color{blue}{\text{diagonal matrix substituted}}\\&=& x^2+6y^2\\&&\phantom{xx}\color{blue}{\text{matrix products worked out}}\end{array}\] Thus we have found the diagonal form #q(\beta^{-1}\rv{x,y})=x^2+6y^2# for #q#. By replacing #\rv{x,y}# by #\beta(\rv{x,y})# we find an expression of #q(x,y)# as a linear combination of two squares: \[ q(x,y) = \frac15(2x+y)^2+\frac65(x-2y)^2\] In particular, #q# turns out to be positive definite.
The diagonal form can be written down as soon as the eigenvalues of #A# are known: they are the coefficients of the squares of the coordinates in #q(\beta^{-1}\rv{x,y})#. The majority of the calculation thus consists of finding an orthonormal basis on which the diagonal form is assumed.
Note that the coordinate transformation #{}_\varepsilon I_\beta# made all mixed products (that is, products of two different variables) disappear.
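The entire computation above can be reproduced in a few lines of NumPy (a sketch; `numpy.linalg.eigh` performs the orthogonal diagonalization):

```python
import numpy as np

A = np.array([[2.0, -2.0], [-2.0, 5.0]])          # matrix of q with respect to the standard basis
eigvals, Q = np.linalg.eigh(A)                    # orthonormal eigenvectors in the columns of Q
print(eigvals)                                    # [1. 6.]  -> diagonal form x^2 + 6 y^2
print(np.round(Q.T @ A @ Q, 10))                  # diag(1, 6)

# the sum-of-squares expression found above reproduces q
x, y = 0.8, -1.5
print(np.isclose(2*x**2 - 4*x*y + 5*y**2,
                 (2*x + y)**2 / 5 + 6 * (x - 2*y)**2 / 5))
```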
Often we will apply the theorem with #V = \mathbb{R}^n# and #\alpha = \varepsilon#, the standard basis, so that #\alpha# is orthonormal. If #\alpha# is orthonormal, then the transition matrix #{}_\alpha I_\beta# is orthogonal if and only if #\beta # is orthonormal. We recall from Orthogonality criteria for matrices and Transition matrices and orthonormal bases that this means that \[{}_\alpha I_\beta^{-1}= {{}_\alpha I_\beta}^\top ={}_\beta I_\alpha\]
The theorem also holds for complex vector spaces.
If all of the eigenvalues of #A# are nonnegative, then #q# is positive semi-definite, which means that #q(\vec{x})\ge0# for all #\vec{x}# in #V#.
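In practice, (semi-)definiteness can therefore be tested by inspecting the spectrum of the matrix of #q#; a short check in NumPy (up to rounding):

```python
import numpy as np

A = np.array([[2.0, -2.0], [-2.0, 5.0]])          # matrix of a quadratic form
print(np.all(np.linalg.eigvalsh(A) >= -1e-12))    # True: q is positive (semi-)definite
```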
The collection of vectors at which a quadratic form assumes a fixed chosen value is called a quadric. It is the set of solutions of a quadratic polynomial equation in several unknowns. In general, the equation of a quadric also involves linear terms in addition to a quadratic form. Later we will go into this further.
Let #q:\mathbb{R}^3\to \mathbb{R}# be the quadratic form defined by
\[\begin{array}{rcl}q(x,y,z) &=& \displaystyle 2 x^2-2 x y-4 x z-2 y^2+8 y z-3 z^2\end{array}\] What is the matrix #A# of #q#?
#A= # #\matrix{2 & -1 & -2 \\ -1 & -2 & 4 \\ -2 & 4 & -3 \\ }#
The matrix #A# is determined by
\[\begin{array}{rcl}q(x,y,z) &=& f(\rv{x,y,z},\rv{x,y,z})\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{f\text{ is the bilinear form of }q}\\
&=&\dotprod{\rv{x,y,z}}{\left(A\, \rv{x, y, z} \right)}\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{\text{definition of }A}\\
&=& {\matrix{x&y&z}}\,A\, \matrix{x\\ y\\ z}\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{\text{inner product rewritten as matrix product}}\\
&=&a_{11}x^2+(a_{12}+a_{21})xy+(a_{13}+a_{31})xz+a_{22}y^2+(a_{23}+a_{32})yz+a_{33}z^2\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{\text{matrix product worked out}}\\
&=& a_{11}x^2+2a_{12}xy+2a_{13}xz+a_{22}y^2+2a_{23}yz+a_{33}z^2\\
&&\phantom{xxxxwwwwwwwxxxx}\color{blue}{A\text{ is symmetric}}
\end{array}\] Comparison with the function rule #q(x,y,z) =2 x^2-2 x y-4 x z-2 y^2+8 y z-3 z^2# gives
\[\begin{array}{rclcr} a_{11}&=&\text{coefficient of } x^2 &=& 2 \\
a_{12}&=&\frac12(\text{coefficient of } x y) &=& -1 \\
a_{13}&=&\frac12(\text{coefficient of } x z )&=& -2 \\
a_{22}&=&\text{coefficient of } y^2 &=& -2 \\
a_{23}&=&\frac12(\text{coefficient of } y z) &=& 4 \\
a_{33}&=&\text{coefficient of } z^2 &=& -3
\end{array}\] The remaining elements of #A# now follow from the fact that #A# is symmetric. The conclusion is
\[\begin{array}{rcl} A &=& \matrix{2 & -1 & -2 \\ -1 & -2 & 4 \\ -2 & 4 & -3 \\ }\end{array}\]
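This extraction of #A# from the coefficients can be checked mechanically: the coefficient of each square goes on the diagonal, and half the coefficient of each mixed product (such as #xy#) goes in the two corresponding off-diagonal entries. A small NumPy sketch (names are ours) confirms that the resulting matrix reproduces #q#:

```python
import numpy as np

def q(w):
    x, y, z = w
    return 2*x**2 - 2*x*y - 4*x*z - 2*y**2 + 8*y*z - 3*z**2

A = np.array([[ 2.0, -1.0, -2.0],     # squares on the diagonal,
              [-1.0, -2.0,  4.0],     # half of each mixed coefficient
              [-2.0,  4.0, -3.0]])    # off the diagonal

w = np.random.default_rng(1).normal(size=3)
print(np.isclose(q(w), w @ A @ w))    # w . (A w) reproduces q(w)
```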