A matrix \(\mathbf{A}\) transforms a vector \(\mathbf{x}\) into another vector \(\mathbf{Ax}\). In general, \(\mathbf{Ax}\) points in a different direction than \(\mathbf{x}\), since the transformation typically involves some rotation.

However, special vectors called eigenvectors keep their direction under the transformation by \(\mathbf{A}\). In this case, \(\mathbf{Ax}\) and \(\mathbf{x}\) are parallel, differing only by a constant scale factor (the vector is merely stretched, compressed, or flipped), which we call the associated eigenvalue.

Definition

For a square matrix \(\mathbf{A}\), a nonzero vector \(\mathbf{x}\) and a real or complex scalar \(\lambda\) are called an eigenvector and its associated eigenvalue, iff they satisfy

\[\mathbf{Ax}=\lambda\mathbf{x}\]

There are infinitely many solutions, since \(c\mathbf{x}\) for any \(c\neq 0\) also satisfies this equation with the same eigenvalue \(\lambda\) (multiply both sides by \(c\)). Consequently, eigenvectors are usually assumed to be normalized, i.e., to satisfy the constraint \(\mathbf{x}^T\mathbf{x}=1\).
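As a quick numerical sanity check, here is a minimal NumPy sketch; the matrix and the eigenpair below are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Illustrative example: a symmetric 2x2 matrix with eigenpair (3, (1, 1)).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam = 3.0
x = np.array([1.0, 1.0])

# The defining equation Ax = lambda * x holds ...
assert np.allclose(A @ x, lam * x)

# ... and remains true for any nonzero scalar multiple (same eigenvalue):
c = -4.2
assert np.allclose(A @ (c * x), lam * (c * x))

# Normalization fixes the scale ambiguity (up to sign):
x_unit = x / np.linalg.norm(x)
assert np.isclose(x_unit @ x_unit, 1.0)
```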

The original equation can be re-arranged a bit:

\[\begin{array}{rrl} &\mathbf{Ax}&=\lambda\mathbf{x}\\ \Leftrightarrow&\mathbf{Ax} - \lambda\mathbf{x}&=\mathbf{0}\\ \Leftrightarrow&(\mathbf{A} - \lambda\mathbf{I})\mathbf{x}&=\mathbf{0} \end{array}\]

This is a homogeneous linear system in \(\mathbf{x}\). It always has the trivial solution \(\mathbf{x}=\mathbf{0}\), but we are explicitly interested in nonzero solutions. A nonzero solution exists if and only if the matrix \(\mathbf{A}-\lambda\mathbf{I}\) is singular, i.e. not invertible. For square matrices, singularity is equivalent to a zero determinant. Therefore, eigenvalues \(\lambda\) are precisely those scalars for which

\[\det(\mathbf{A}-\lambda\mathbf{I})=0.\]

Once an eigenvalue \(\lambda\) is known, eigenvectors are obtained by solving the corresponding linear system

\[(\mathbf{A}-\lambda\mathbf{I})\mathbf{x}=\mathbf{0}\]

and then choosing any nonzero solution and (optionally) normalizing it.
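The determinant criterion is easy to verify numerically. The following sketch (again with an assumed example matrix) checks that \(\det(\mathbf{A}-\lambda\mathbf{I})\) vanishes exactly at the eigenvalues and nowhere else:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# The eigenvalues of this symmetric matrix are 3 and 1.
for lam in np.linalg.eigvals(A):
    # det(A - lam*I) should vanish (up to floating-point error).
    d = np.linalg.det(A - lam * np.eye(2))
    print(f"lambda = {lam:.4f}, det(A - lambda I) = {d:.2e}")

# A value that is NOT an eigenvalue gives a nonzero determinant:
print(np.linalg.det(A - 2.0 * np.eye(2)))   # (2-2)(2-2) - 1 = -1
```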

The Characteristic Polynomial

For \(\mathbf{A}\in\mathbb{C}^{n\times n}\), the expression \(\det(\mathbf{A}-\lambda\mathbf{I})\) is a polynomial in \(\lambda\) of degree \(n\). This polynomial is called the characteristic polynomial of \(\mathbf{A}\):

\[p_{\mathbf{A}}(\lambda):=\det(\mathbf{A}-\lambda\mathbf{I}).\]

The equation \(p_{\mathbf{A}}(\lambda)=0\) is called the characteristic equation. Its roots \(\lambda_1,\ldots,\lambda_n\) (counted with algebraic multiplicity) are the eigenvalues of \(\mathbf{A}\).

To see directly why a polynomial appears, it helps to look at an explicit example and observe what changes when we subtract \(\lambda\mathbf{I}\): only the diagonal entries are shifted by \(-\lambda\). When you compute the determinant, you multiply and add these entries in a structured way; as soon as \(\lambda\) appears on the diagonal, the determinant becomes an expression containing powers of \(\lambda\), i.e. a polynomial.
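NumPy can compute these polynomial coefficients directly. Note that `np.poly` uses the monic convention \(\det(\lambda\mathbf{I}-\mathbf{A})\), which differs from \(p_{\mathbf{A}}\) only by the factor \((-1)^n\) and therefore has the same roots; the example matrix is again an assumption:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Coefficients of the monic characteristic polynomial det(lambda*I - A),
# highest power first: here [1, -4, 3], i.e. lambda^2 - 4*lambda + 3.
coeffs = np.poly(A)
print(coeffs)

# Its roots are exactly the eigenvalues (order may differ between calls):
print(np.roots(coeffs))          # [3., 1.]
print(np.linalg.eigvals(A))      # same values
```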

A concrete \(2\times 2\) example

Let

\[\mathbf{A}=\begin{pmatrix}a&b\\c&d\end{pmatrix}.\]

Then

\[\mathbf{A}-\lambda\mathbf{I}= \begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix}\]

and the determinant expands to

\[\det(\mathbf{A}-\lambda\mathbf{I}) =\det\begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix} =(a-\lambda)(d-\lambda)-bc.\]

Multiplying out gives

\[(a-\lambda)(d-\lambda)-bc =\lambda^2-(a+d)\lambda+(ad-bc).\]

So the characteristic polynomial is

\[p_{\mathbf{A}}(\lambda)=\lambda^2-\mathrm{Tr}(\mathbf{A})\,\lambda+\det(\mathbf{A}).\]

Its roots are the eigenvalues. For higher-dimensional matrices the determinant expansion is more involved, but the same mechanism applies: subtracting \(\lambda\) on the diagonal forces the determinant to become a polynomial in \(\lambda\) whose roots are the eigenvalues.
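For the \(2\times 2\) case the quadratic formula gives the eigenvalues in closed form from trace and determinant. A small sketch (the function name and example matrix are my own) compares it against np.linalg.eigvals:

```python
import numpy as np

def eig2x2(A):
    """Eigenvalues of a 2x2 matrix as the roots of
    lambda^2 - Tr(A)*lambda + det(A), possibly complex."""
    tr = A[0, 0] + A[1, 1]
    det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
    disc = np.lib.scimath.sqrt(tr**2 - 4 * det)  # complex sqrt if negative
    return (tr + disc) / 2, (tr - disc) / 2

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(eig2x2(A))                 # (3.0, 1.0)
print(np.linalg.eigvals(A))      # same values
```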

How to compute eigenvectors once \(\lambda\) is known

For each eigenvalue \(\lambda\), form \(\mathbf{A}-\lambda\mathbf{I}\) and solve

\[(\mathbf{A}-\lambda\mathbf{I})\mathbf{x}=\mathbf{0}.\]

The solution set is the null space (kernel) \(\mathcal{N}(\mathbf{A}-\lambda\mathbf{I})\), called the eigenspace associated with \(\lambda\). Any nonzero vector in this eigenspace is an eigenvector. If the eigenspace has dimension greater than \(1\), then \(\lambda\) has multiple linearly independent eigenvectors; choosing an orthonormal basis of the eigenspace is often convenient, as sketched below.
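One robust way to compute such a basis numerically is via the singular value decomposition of \(\mathbf{A}-\lambda\mathbf{I}\): the right-singular vectors belonging to (numerically) zero singular values span the null space. A sketch, with a hypothetical helper name and an assumed tolerance:

```python
import numpy as np

def eigenspace_basis(A, lam, tol=1e-10):
    """Orthonormal basis of the eigenspace N(A - lam*I) via the SVD."""
    n = A.shape[0]
    _, s, Vt = np.linalg.svd(A - lam * np.eye(n))
    # Rows of Vt whose singular value is numerically zero span the kernel.
    null_mask = s <= tol * max(s[0], 1.0)
    return Vt[null_mask].T   # columns = orthonormal basis vectors

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(eigenspace_basis(A, 3.0))   # one column, proportional to (1, 1)/sqrt(2)
```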

Properties

Orthogonal Matrices

An orthogonal matrix is a real square matrix \(\mathbf{A}\) such that

\[\mathbf{A}^T=\mathbf{A}^{-1}.\]

Equivalently,

\[\mathbf{A}\mathbf{A}^T=\mathbf{I}\quad\text{and}\quad \mathbf{A}^T\mathbf{A}=\mathbf{I}.\]

This means the rows (and columns) of \(\mathbf{A}\) form an orthonormal set: each has unit length, and any two distinct ones are orthogonal.

Geometrically, orthogonal matrices represent length- and angle-preserving transformations (rotations and reflections). In particular, \(\|\mathbf{Ax}\|_2=\|\mathbf{x}\|_2\) for all \(\mathbf{x}\).
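A plane rotation is the standard concrete example; the sketch below (angle chosen arbitrarily) checks both orthogonality conditions and the norm preservation:

```python
import numpy as np

# A 2D rotation by angle theta is an orthogonal matrix.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

assert np.allclose(Q.T @ Q, np.eye(2))   # Q^T Q = I
assert np.allclose(Q @ Q.T, np.eye(2))   # Q Q^T = I

x = np.array([3.0, -4.0])
# Lengths are preserved: ||Qx||_2 = ||x||_2
print(np.linalg.norm(Q @ x), np.linalg.norm(x))   # both 5.0
```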

Eigenvalue Decomposition (EVD)

If \(\mathbf{A}\in\mathbb{R}^{n\times n}\) has \(n\) linearly independent eigenvectors \(\mathbf{q}_i\) (for \(i=1,\ldots,n\)) and is therefore diagonalisable, then \(\mathbf{A}\) can be factorized as

\[\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1}\]

where \(\mathbf{Q}\) is an \(n\times n\) matrix whose \(i\)-th column is the eigenvector \(\mathbf{q}_i\) of \(\mathbf{A}\), and \(\mathbf{\Lambda}\) is the diagonal matrix whose diagonal elements are the corresponding eigenvalues, \(\Lambda_{ii}=\lambda_i\). The decomposition follows directly from the defining equation for eigenvalues and eigenvectors:

\[\begin{array}{rrl} &\mathbf{A} \mathbf{q}_i &= \lambda_i \mathbf{q}_i \quad (i=1,\ldots,n)\\ \Leftrightarrow&\mathbf{A} \mathbf{Q} &= \mathbf{Q} \mathbf{\Lambda} \\ \Leftrightarrow&\mathbf{A} &= \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1} . \end{array}\]
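This derivation can be replayed numerically with np.linalg.eig; the example matrix below (with distinct eigenvalues, hence diagonalisable) is an assumption for illustration:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])   # eigenvalues 5 and 2

# Columns of Q are eigenvectors; Lam holds the eigenvalues on its diagonal.
eigvals, Q = np.linalg.eig(A)
Lam = np.diag(eigvals)

# A Q = Q Lambda, and since Q is invertible here, A = Q Lambda Q^{-1}:
assert np.allclose(A @ Q, Q @ Lam)
assert np.allclose(A, Q @ Lam @ np.linalg.inv(Q))
```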

Spectral theorem (important special case): If \(\mathbf{A}\) is real symmetric, it is always diagonalisable with an orthogonal eigenvector matrix. Hence

\[\mathbf{A}=\mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^T,\qquad \mathbf{Q}^T\mathbf{Q}=\mathbf{I}.\]
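For the symmetric case, np.linalg.eigh returns exactly this factorization: real eigenvalues and an orthogonal \(\mathbf{Q}\). A minimal sketch with an assumed symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # real symmetric

eigvals, Q = np.linalg.eigh(A)

assert np.allclose(Q.T @ Q, np.eye(2))               # Q is orthogonal
assert np.allclose(A, Q @ np.diag(eigvals) @ Q.T)    # A = Q Lambda Q^T
```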

Applications of Eigenvalues and Eigenvectors