A matrix \(\mathbf{A}\) transforms a vector \(\mathbf{x}\) into another vector \(\mathbf{Ax}\). In general, \(\mathbf{Ax}\) points in a different direction than \(\mathbf{x}\), since some sort of rotation may be part of the transformation.

However, special vectors called eigenvectors keep their direction under a transformation with \(\mathbf{A}\) (i.e. they remain invariant in direction). In this case, \(\mathbf{Ax}\) and \(\mathbf{x}\) are parallel, differing only in a constant scale factor (only stretched, compressed, or flipped), which we call the associated eigenvalue.

Definition

For a square matrix \(\mathbf{A}\), a nonzero vector \(\mathbf{x}\) and a (real or complex) scalar \(\lambda\) are an eigenvector and its associated eigenvalue if and only if they satisfy

\[\mathbf{Ax}=\lambda\mathbf{x}\]

There are infinitely many solutions: if \(\mathbf{x}\) satisfies this equation, then so does \(c\mathbf{x}\) for any \(c\neq 0\), with the same eigenvalue \(\lambda\). Consequently, eigenvectors are often assumed to be normalized, i.e., to satisfy the constraint \(\mathbf{x}^T\mathbf{x}=1\).

The original equation can be re-arranged a bit:

\[\begin{array}{rrl} &\mathbf{Ax}&=\lambda\mathbf{x}\\ \Leftrightarrow&\mathbf{Ax} - \lambda\mathbf{x}&=\mathbf{0}\\ \Leftrightarrow&(\mathbf{A} - \lambda\mathbf{I})\mathbf{x}&=\mathbf{0} \end{array}\]

This is a homogeneous linear system in \(\mathbf{x}\). It always has the trivial solution \(\mathbf{x}=\mathbf{0}\), but we are explicitly interested in nonzero solutions. A nonzero solution exists if and only if the matrix \(\mathbf{A}-\lambda\mathbf{I}\) is singular, i.e. not invertible. For square matrices, singularity is equivalent to a zero determinant. Therefore, eigenvalues \(\lambda\) are precisely those scalars for which

\[\det(\mathbf{A}-\lambda\mathbf{I})=0.\]

Once an eigenvalue \(\lambda\) is known, eigenvectors are obtained by solving the corresponding linear system

\[(\mathbf{A}-\lambda\mathbf{I})\mathbf{x}=\mathbf{0}\]

and then choosing any nonzero solution and (optionally) normalizing it.
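The procedure above can be checked numerically. The following is a minimal sketch using NumPy; the matrix values are illustrative, and `np.linalg.eig` already returns unit-norm eigenvectors as the columns of its second output.

```python
import numpy as np

# A small illustrative matrix (values chosen arbitrarily for this sketch).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and unit-norm eigenvectors (columns).
eigvals, eigvecs = np.linalg.eig(A)

# Verify the defining relation A x = lambda x for each eigenpair.
for i in range(len(eigvals)):
    x = eigvecs[:, i]
    assert np.allclose(A @ x, eigvals[i] * x)
```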

The Characteristic Polynomial

For \(\mathbf{A}\in\mathbb{C}^{n\times n}\), the expression \(\det(\mathbf{A}-\lambda\mathbf{I})\) is a polynomial in \(\lambda\) of degree \(n\). This polynomial is called the characteristic polynomial of \(\mathbf{A}\):

\[p_{\mathbf{A}}(\lambda):=\det(\mathbf{A}-\lambda\mathbf{I}).\]

The equation \(p_{\mathbf{A}}(\lambda)=0\) is called the characteristic equation. Its roots \(\lambda_1,\ldots,\lambda_n\) (counted with algebraic multiplicity) are the eigenvalues of \(\mathbf{A}\).

To see directly why a polynomial appears, it helps to look at an explicit example and observe what changes when we subtract \(\lambda\mathbf{I}\): only the diagonal entries are shifted by \(-\lambda\). When you compute the determinant, you multiply and add these entries in a structured way; as soon as \(\lambda\) appears on the diagonal, the determinant becomes an expression containing powers of \(\lambda\), i.e. a polynomial.

A concrete \(2\times 2\) example

Let

\[\mathbf{A}=\begin{pmatrix}a&b\\c&d\end{pmatrix}.\]

Then

\[\mathbf{A}-\lambda\mathbf{I}= \begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix}\]

and the determinant expands to

\[\det(\mathbf{A}-\lambda\mathbf{I}) =\det\begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix} =(a-\lambda)(d-\lambda)-bc.\]

Multiplying out gives

\[(a-\lambda)(d-\lambda)-bc =\lambda^2-(a+d)\lambda+(ad-bc).\]

So the characteristic polynomial is

\[p_{\mathbf{A}}(\lambda)=\lambda^2-\mathrm{Tr}(\mathbf{A})\,\lambda+\det(\mathbf{A}).\]

Its roots are the eigenvalues. For higher-dimensional matrices the determinant expansion is more involved, but the same mechanism applies: subtracting \(\lambda\) on the diagonal forces the determinant to become a polynomial in \(\lambda\) whose roots are the eigenvalues.
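For the \(2\times 2\) case, the coefficients of the characteristic polynomial are just \(-\mathrm{Tr}(\mathbf{A})\) and \(\det(\mathbf{A})\), so its roots can be computed directly and compared against an eigenvalue routine. A sketch with an illustrative matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])  # illustrative example matrix

# Characteristic polynomial: lambda^2 - Tr(A) lambda + det(A).
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.roots(coeffs)

# The polynomial roots coincide with the eigenvalues computed directly.
assert np.allclose(np.sort(roots), np.sort(np.linalg.eig(A)[0]))
```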

How to compute eigenvectors once \(\lambda\) is known

For each eigenvalue \(\lambda\), form \(\mathbf{A}-\lambda\mathbf{I}\) and solve

\[(\mathbf{A}-\lambda\mathbf{I})\mathbf{x}=\mathbf{0}.\]

The solution set is the null space (kernel) \(\mathcal{N}(\mathbf{A}-\lambda\mathbf{I})\), called the eigenspace associated with \(\lambda\). Any nonzero vector in this eigenspace is an eigenvector. If the eigenspace has dimension greater than \(1\), then \(\lambda\) has multiple linearly independent eigenvectors; choosing an orthonormal basis of the eigenspace is often convenient.
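A null-space vector of \(\mathbf{A}-\lambda\mathbf{I}\) can be extracted numerically from the SVD: the right singular vector belonging to the (numerically) zero singular value spans the eigenspace when it is one-dimensional. A sketch, using an illustrative matrix whose eigenvalue \(\lambda=5\) follows from \(\lambda^2-7\lambda+10=0\):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lam = 5.0  # an eigenvalue of this A (root of lambda^2 - 7 lambda + 10)

# The eigenspace is the null space of (A - lam*I); the right singular
# vector for the smallest singular value spans it (last row of Vt).
_, s, Vt = np.linalg.svd(A - lam * np.eye(2))
x = Vt[-1]  # null-space direction, already unit-norm

assert np.allclose(A @ x, lam * x)
```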

Properties

Orthogonal Matrices

An orthogonal matrix is a real square matrix \(\mathbf{A}\) such that

\[\mathbf{A}^T=\mathbf{A}^{-1}.\]

Equivalently,

\[\mathbf{A}\mathbf{A}^T=\mathbf{I}\quad\text{and}\quad \mathbf{A}^T\mathbf{A}=\mathbf{I}.\]

This means the rows (and columns) of \(\mathbf{A}\) form an orthonormal set: each row has unit length, and distinct rows are mutually orthogonal (and likewise for the columns).

Geometrically, orthogonal matrices represent length- and angle-preserving transformations (rotations and reflections). In particular, \(\|\mathbf{Ax}\|_2=\|\mathbf{x}\|_2\) for all \(\mathbf{x}\).
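Both properties are easy to verify numerically. A sketch, where a random orthogonal matrix is obtained from the QR decomposition of a random matrix (a common construction, used here purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# QR decomposition of a random matrix yields an orthogonal factor Q.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

assert np.allclose(Q.T @ Q, np.eye(3))  # Q^T Q = I

# Orthogonal matrices preserve Euclidean length: ||Qx|| = ||x||.
x = rng.standard_normal(3)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
```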

Rotation Matrices and Complex Eigenvalues

A particularly beautiful example is the two-dimensional rotation matrix

\[ \mathbf{R}(\theta)= \begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix}. \]

This matrix rotates every vector in \(\mathbb{R}^2\) by the angle \(\theta\) about the origin. Since rotations preserve lengths and angles, \(\mathbf{R}(\theta)\) is orthogonal and satisfies

\[ \mathbf{R}(\theta)^T\mathbf{R}(\theta)=\mathbf{I}. \]

To find its eigenvalues, compute the characteristic polynomial:

\[ \det(\mathbf{R}(\theta)-\lambda\mathbf{I}) = \det\begin{pmatrix} \cos\theta-\lambda & -\sin\theta\\ \sin\theta & \cos\theta-\lambda \end{pmatrix}. \]

Expanding the determinant gives

\[ (\cos\theta-\lambda)^2+\sin^2\theta = 0, \]

hence

\[ \lambda^2 - 2\cos\theta\,\lambda + 1 = 0. \]

The two eigenvalues are therefore

\[ \lambda_{1,2}=\cos\theta \pm i\sin\theta = e^{\pm i\theta}. \]

This shows the eigenvalues of a planar rotation are themselves points on the unit circle in the complex plane. In other words, the rotation reappears inside its own spectrum.
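This can be confirmed numerically for any angle; here a sketch with an arbitrarily chosen \(\theta\):

```python
import numpy as np

theta = 0.7  # an arbitrary angle, chosen for illustration
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigvals = np.linalg.eig(R)[0]

# Both eigenvalues lie on the unit circle ...
assert np.allclose(np.abs(eigvals), 1.0)
# ... and equal cos(theta) +- i sin(theta) = e^{+- i theta}.
expected = np.array([np.exp(-1j * theta), np.exp(1j * theta)])
assert np.allclose(np.sort_complex(eigvals), np.sort_complex(expected))
```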

To see why this is so natural, identify a vector \(\begin{pmatrix}x\\y\end{pmatrix}\in\mathbb{R}^2\) with the complex number \(z=x+iy\). Then applying the matrix \(\mathbf{R}(\theta)\) is exactly the same as multiplying by \(e^{i\theta}\):

\[ \mathbf{R}(\theta) \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} x\cos\theta-y\sin\theta\\ x\sin\theta+y\cos\theta \end{pmatrix} \quad\longleftrightarrow\quad e^{i\theta}(x+iy). \]

So a planar rotation can be understood in two equivalent ways: as the real linear map \(\mathbf{R}(\theta)\) acting on the vector \(\begin{pmatrix}x\\y\end{pmatrix}\), or as multiplication by \(e^{i\theta}\) acting on the complex number \(z=x+iy\).

From this viewpoint, the complex eigenvalues are exciting because they reveal a more natural coordinate system for the transformation. What looks like a coupled two-dimensional motion in real coordinates becomes a simple multiplication in complex coordinates.
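The equivalence between the two viewpoints can be checked directly; a sketch with arbitrary sample values:

```python
import numpy as np

theta = 0.7  # arbitrary angle for illustration
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x, y = 2.0, -1.0                         # arbitrary sample point
rotated = R @ np.array([x, y])           # rotate as a real 2-vector
z = np.exp(1j * theta) * (x + 1j * y)    # rotate as a complex number

# Both descriptions give the same point in the plane.
assert np.allclose(rotated, [z.real, z.imag])
```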

Over \(\mathbb{C}\), corresponding eigenvectors do exist. For example, for \(\lambda_1=\cos\theta+i\sin\theta\), one may choose

\[ \mathbf{v}_1= \begin{pmatrix} 1\\ -i \end{pmatrix}, \qquad \text{and for } \lambda_2=\cos\theta-i\sin\theta, \quad \mathbf{v}_2= \begin{pmatrix} 1\\ i \end{pmatrix}. \]

Thus the eigendecomposition does not merely produce numbers: it uncovers a change in representation. A real rotation matrix becomes diagonal once the plane is viewed through the complex-number model, where rotation is simply multiplication by \(e^{i\theta}\).
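The complex eigenpairs given above can be verified by direct substitution into \(\mathbf{R}(\theta)\mathbf{v}=\lambda\mathbf{v}\); a sketch with an arbitrary angle:

```python
import numpy as np

theta = 0.7  # arbitrary angle for illustration
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v1 = np.array([1.0, -1.0j])  # eigenvector for e^{+i theta}
v2 = np.array([1.0,  1.0j])  # eigenvector for e^{-i theta}

assert np.allclose(R @ v1, np.exp( 1j * theta) * v1)
assert np.allclose(R @ v2, np.exp(-1j * theta) * v2)
```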

Eigenvalue Decomposition (EVD)

If \(\mathbf{A}\in\mathbb{R}^{n\times n}\) has \(n\) linearly independent eigenvectors \(\mathbf{q}_i\) (for \(i=1,\ldots,n\)) and is therefore diagonalisable, then \(\mathbf{A}\) can be factorized as

\[\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1}\]

where \(\mathbf{Q}\) is an \(n\times n\) matrix whose \(i\)th column is the eigenvector \(\mathbf{q}_i\) of \(\mathbf{A}\), and \(\mathbf{\Lambda}\) is the diagonal matrix whose diagonal elements are the corresponding eigenvalues, \(\Lambda_{ii}=\lambda_i\). The decomposition can be derived directly from the initial statement about eigenvalues and eigenvectors:

\[\begin{array}{rrl} &\mathbf{A} \mathbf{q}_i &= \lambda_i \mathbf{q}_i \quad (i=1,\ldots,n)\\ \Leftrightarrow&\mathbf{A} \mathbf{Q} &= \mathbf{Q} \mathbf{\Lambda} \\ \Leftrightarrow&\mathbf{A} &= \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1} . \end{array}\]

Spectral theorem (important special case): If \(\mathbf{A}\) is real symmetric, it is always diagonalisable with an orthogonal eigenvector matrix. Hence

\[\mathbf{A}=\mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^T,\qquad \mathbf{Q}^T\mathbf{Q}=\mathbf{I}.\]
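For symmetric matrices, NumPy's `eigh` routine exploits this structure and returns an orthonormal eigenvector matrix directly. A sketch with an illustrative symmetric matrix:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # illustrative real symmetric matrix

lam, Q = np.linalg.eigh(S)  # eigh: for symmetric/Hermitian matrices

assert np.allclose(Q.T @ Q, np.eye(2))         # eigenvectors orthonormal
assert np.allclose(Q @ np.diag(lam) @ Q.T, S)  # S = Q Lambda Q^T
```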

Note, however, that not every real matrix is diagonalisable over \(\mathbb{R}\). For example, a nontrivial two-dimensional rotation matrix has no real eigenvectors and therefore only becomes diagonalisable when the field is extended to \(\mathbb{C}\).

Applications of Eigenvalues and Eigenvectors