A matrix \(\mathbf{A}\) transforms a vector \(\mathbf{x}\) into another vector \(\mathbf{Ax}\). In general, \(\mathbf{Ax}\) points in a different direction than \(\mathbf{x}\), since some sort of rotation may be part of the transformation.
However, special vectors called eigenvectors keep their direction under the transformation with \(\mathbf{A}\): \(\mathbf{Ax}\) and \(\mathbf{x}\) are parallel, differing only by a constant scale factor (the vector is only stretched, compressed, or flipped). This scale factor is called the associated eigenvalue.
Definition
For a square matrix \(\mathbf{A}\), a nonzero vector \(\mathbf{x}\) and a real or complex scalar \(\lambda\) are called an eigenvector and its associated eigenvalue iff they satisfy
\[\mathbf{Ax}=\lambda\mathbf{x}\]
There are infinitely many such eigenvectors, since \(c\mathbf{x}\) for any \(c\neq 0\) also satisfies this equation with the same \(\lambda\). Consequently, eigenvectors are usually assumed to be normalized, i.e., to satisfy the constraint \(\mathbf{x}^T\mathbf{x}=1\).
The original equation can be re-arranged a bit:
\[\begin{array}{rrl} &\mathbf{Ax}&=\lambda\mathbf{x}\\ \Leftrightarrow&\mathbf{Ax} - \lambda\mathbf{x}&=\mathbf{0}\\ \Leftrightarrow&(\mathbf{A} - \lambda\mathbf{I})\mathbf{x}&=\mathbf{0} \end{array}\]
This is a homogeneous linear system in \(\mathbf{x}\). It always has the trivial solution \(\mathbf{x}=\mathbf{0}\), but we are explicitly interested in nonzero solutions. A nonzero solution exists if and only if the matrix \(\mathbf{A}-\lambda\mathbf{I}\) is singular, i.e. not invertible. For square matrices, singularity is equivalent to a zero determinant. Therefore, eigenvalues \(\lambda\) are precisely those scalars for which
\[\det(\mathbf{A}-\lambda\mathbf{I})=0.\]
Once an eigenvalue \(\lambda\) is known, eigenvectors are obtained by solving the corresponding linear system
\[(\mathbf{A}-\lambda\mathbf{I})\mathbf{x}=\mathbf{0}\]
and then choosing any nonzero solution and (optionally) normalizing it.
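As a concrete numerical illustration, here is a minimal sketch assuming NumPy is available (the matrix is an arbitrary example, not taken from the text). It computes eigenpairs, verifies the defining equation \(\mathbf{Ax}=\lambda\mathbf{x}\), and checks that rescaling an eigenvector leaves the eigenvalue unchanged:

```python
import numpy as np

# Arbitrary example matrix (for illustration only)
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns eigenvalues and column-wise normalized eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    x = eigenvectors[:, i]
    # Defining equation: A x = lambda x
    assert np.allclose(A @ x, lam * x)
    # Any nonzero multiple c*x is also an eigenvector for the same lambda
    c = 3.7
    assert np.allclose(A @ (c * x), lam * (c * x))
    # The returned eigenvectors are normalized: x^T x = 1
    assert np.isclose(x @ x, 1.0)

print(eigenvalues)   # eigenvalues of this A are 5 and 2 (order may vary)
```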
The Characteristic Polynomial
For \(\mathbf{A}\in\mathbb{C}^{n\times n}\), the expression \(\det(\mathbf{A}-\lambda\mathbf{I})\) is a polynomial in \(\lambda\) of degree \(n\). This polynomial is called the characteristic polynomial of \(\mathbf{A}\):
\[p_{\mathbf{A}}(\lambda):=\det(\mathbf{A}-\lambda\mathbf{I}).\]
The equation \(p_{\mathbf{A}}(\lambda)=0\) is called the characteristic equation. Its roots \(\lambda_1,\ldots,\lambda_n\) (counted with algebraic multiplicity) are the eigenvalues of \(\mathbf{A}\).
To see directly why a polynomial appears, it helps to look at an explicit example and observe what changes when we subtract \(\lambda\mathbf{I}\): only the diagonal entries are shifted by \(-\lambda\). When you compute the determinant, you multiply and add these entries in a structured way; as soon as \(\lambda\) appears on the diagonal, the determinant becomes an expression containing powers of \(\lambda\), i.e. a polynomial.
A concrete \(2\times 2\) example
Let
\[\mathbf{A}=\begin{pmatrix}a&b\\c&d\end{pmatrix}.\]
Then
\[\mathbf{A}-\lambda\mathbf{I}= \begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix}\]
and the determinant expands to
\[\det(\mathbf{A}-\lambda\mathbf{I}) =\det\begin{pmatrix}a-\lambda&b\\c&d-\lambda\end{pmatrix} =(a-\lambda)(d-\lambda)-bc.\]
Multiplying out gives
\[(a-\lambda)(d-\lambda)-bc =\lambda^2-(a+d)\lambda+(ad-bc).\]
So the characteristic polynomial is
\[p_{\mathbf{A}}(\lambda)=\lambda^2-\mathrm{Tr}(\mathbf{A})\,\lambda+\det(\mathbf{A}).\]
Its roots are the eigenvalues. For higher-dimensional matrices the determinant expansion is more involved, but the same mechanism applies: subtracting \(\lambda\) on the diagonal forces the determinant to become a polynomial in \(\lambda\) whose roots are the eigenvalues.
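To connect the formula above to a computation, the following sketch (assuming NumPy; the \(2\times 2\) matrix is again an arbitrary example) compares the roots of \(\lambda^2-\mathrm{Tr}(\mathbf{A})\lambda+\det(\mathbf{A})\) with the eigenvalues returned by a library routine:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])             # arbitrary 2x2 example

tr, det = np.trace(A), np.linalg.det(A)

# Roots of the characteristic polynomial lambda^2 - tr*lambda + det
roots = np.roots([1.0, -tr, det])

# Compare (after sorting) with the library eigenvalues
print(np.sort(roots))                   # [1. 3.]
print(np.sort(np.linalg.eigvals(A)))    # [1. 3.]
```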
How to compute eigenvectors once \(\lambda\) is known
For each eigenvalue \(\lambda\), form \(\mathbf{A}-\lambda\mathbf{I}\) and solve
\[(\mathbf{A}-\lambda\mathbf{I})\mathbf{x}=\mathbf{0}.\]
The solution set is the null space (kernel) \(\mathcal{N}(\mathbf{A}-\lambda\mathbf{I})\), called the eigenspace associated with \(\lambda\). Any nonzero vector in this eigenspace is an eigenvector. If the eigenspace has dimension greater than \(1\), then \(\lambda\) has multiple linearly independent eigenvectors; choosing an orthonormal basis of the eigenspace is often convenient.
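In floating-point practice the null space is usually obtained from a rank-revealing factorization rather than by Gaussian elimination. A minimal SVD-based sketch (assuming NumPy; the tolerance and example matrix are illustrative choices) that extracts an orthonormal basis of \(\mathcal{N}(\mathbf{A}-\lambda\mathbf{I})\) for a given eigenvalue:

```python
import numpy as np

def eigenspace_basis(A, lam, tol=1e-10):
    """Orthonormal basis of the null space of (A - lam*I),
    i.e. the eigenspace of A associated with the eigenvalue lam."""
    M = A - lam * np.eye(A.shape[0])
    # Rows of Vh whose singular values are (numerically) zero span the null space
    _, s, Vh = np.linalg.svd(M)
    return Vh[s <= tol].conj().T        # columns = orthonormal eigenspace basis

# Example: lambda = 2 has a two-dimensional eigenspace here
A = np.array([[2.0, 0.0],
              [0.0, 2.0]])
basis = eigenspace_basis(A, 2.0)
print(basis.shape)                      # (2, 2): geometric multiplicity is 2
```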
Properties
- Eigenvalues are roots of the characteristic polynomial: \(\lambda\) is an eigenvalue of \(\mathbf{A}\) iff \(\det(\mathbf{A}-\lambda\mathbf{I})=0\), i.e. iff \(p_{\mathbf{A}}(\lambda)=0\).
- Symmetric matrices: If \(\mathbf{A}\) is real symmetric, all eigenvalues are real.
- Trace and determinant (with multiplicities): If \(\lambda_1,\ldots,\lambda_n\) are the eigenvalues counted with algebraic multiplicity, then \[\mathrm{Tr}(\mathbf{A})=\sum_{i=1}^n\lambda_i,\qquad \det(\mathbf{A})=\prod_{i=1}^n\lambda_i.\]
- If \(\mathbf{A}\) has \(n\) distinct eigenvalues, the corresponding eigenvectors are linearly independent; they form a basis of \(\mathbb{R}^n\) (or \(\mathbb{C}^n\)) and \(\mathbf{A}\) is diagonalisable.
- Diagonal and triangular matrices: If \(\mathbf{A}\) is diagonal (or, more generally, triangular), its eigenvalues are exactly its diagonal entries.
- Scaling: The scaled matrix \(c\mathbf{A}\) (scalar \(c\)) has eigenvalues \(c\lambda_i\) with the same eigenvectors \(\mathbf{x}_i\).
Proof: \(\mathbf{Ax}_i=\lambda_i\mathbf{x}_i \Rightarrow (c\mathbf{A})\mathbf{x}_i=(c\lambda_i)\mathbf{x}_i.\)
- Shifting by the identity: \(\mathbf{A}+c\mathbf{I}\) has eigenvalues \(\lambda_i+c\) with the same eigenvectors.
Proof: \((\mathbf{A}+c\mathbf{I})\mathbf{x}_i=\mathbf{A}\mathbf{x}_i+c\mathbf{x}_i=(\lambda_i+c)\mathbf{x}_i.\)
- Powers: \(\mathbf{A}^t\) has eigenvalues \(\lambda_i^t\) and eigenvectors \(\mathbf{x}_i\) (integer \(t\ge 0\)). More generally, whenever \(f(\mathbf{A})\) is defined via a polynomial \(f\), the eigenvalues transform as \(f(\lambda_i)\) on the same eigenvectors.
- Inverse: If \(\mathbf{A}^{-1}\) exists, it has eigenvalues \(\frac{1}{\lambda_i}\) with the same eigenvectors.
Proof: \[\begin{array}{rrl} &\mathbf{A}\mathbf{x}_i&=\lambda_i\mathbf{x}_i\\ \Leftrightarrow&\mathbf{A}^{-1}\mathbf{A}\mathbf{x}_i&=\mathbf{A}^{-1}\lambda_i\mathbf{x}_i\\ \Leftrightarrow&\mathbf{x}_i&=\lambda_i\mathbf{A}^{-1}\mathbf{x}_i\\ \Leftrightarrow&\mathbf{A}^{-1}\mathbf{x}_i&=\frac{1}{\lambda_i}\mathbf{x}_i\quad\square \end{array}\]
- Transpose: \(\mathbf{A}\) and \(\mathbf{A}^T\) have the same eigenvalues (over \(\mathbb{C}\)).
- Similarity invariance: If \(\mathbf{B}=\mathbf{S}^{-1}\mathbf{A}\mathbf{S}\) for an invertible \(\mathbf{S}\), then \(\mathbf{A}\) and \(\mathbf{B}\) have the same characteristic polynomial and the same eigenvalues.
- Products \(\mathbf{AB}\) and \(\mathbf{BA}\): The nonzero eigenvalues of \(\mathbf{AB}\) equal the nonzero eigenvalues of \(\mathbf{BA}\) (with multiplicities). In particular, \(\mathrm{Tr}(\mathbf{AB})=\mathrm{Tr}(\mathbf{BA})\).
- Real matrices and complex conjugate pairs: If \(\mathbf{A}\) is real and has a non-real eigenvalue \(\lambda\in\mathbb{C}\setminus\mathbb{R}\), then \(\overline{\lambda}\) is also an eigenvalue (with the same multiplicity).
- Orthogonality of eigenvectors (symmetric case): If \(\mathbf{A}\) is real symmetric, then eigenvectors corresponding to distinct eigenvalues are orthogonal and can be chosen to form an orthonormal basis. In particular, they can be used as axes, e.g. when visualising a covariance matrix.
- Positive (semi-)definite matrices:
- If \(\mathbf{A}\) is positive definite (\(\mathbf{x}^T\mathbf{A}\mathbf{x}>0\) for all \(\mathbf{x}\neq\mathbf{0}\)), then all eigenvalues satisfy \(\lambda>0\).
- If \(\mathbf{A}\) is positive semidefinite (\(\mathbf{x}^T\mathbf{A}\mathbf{x}\ge 0\) for all \(\mathbf{x}\)), then all eigenvalues satisfy \(\lambda\ge 0\).
- Multiplicity (conceptual clarity): An eigenvalue may occur multiple times as a root of \(p_{\mathbf{A}}(\lambda)\); this is its algebraic multiplicity. The dimension of its eigenspace \(\mathcal{N}(\mathbf{A}-\lambda\mathbf{I})\) is its geometric multiplicity. Always \(1\le \text{geometric}\le \text{algebraic}\).
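Several of the properties above are easy to check numerically. A small sketch, assuming NumPy (the matrices are arbitrary examples chosen to have real, well-separated eigenvalues so that sorting is unambiguous):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])   # eigenvalues 5 and 2 (arbitrary example)
B = np.array([[1.0, 2.0],
              [0.0, 3.0]])   # triangular: eigenvalues are the diagonal entries
lam = np.sort(np.linalg.eigvals(A))

# Trace = sum of eigenvalues, determinant = product of eigenvalues
assert np.isclose(np.trace(A), lam.sum())
assert np.isclose(np.linalg.det(A), lam.prod())

# A and A^T have the same eigenvalues
assert np.allclose(lam, np.sort(np.linalg.eigvals(A.T)))

# Shifting: eigenvalues of A + cI are lambda_i + c; scaling: c*A has c*lambda_i
c = 2.5
assert np.allclose(np.sort(np.linalg.eigvals(A + c * np.eye(2))), lam + c)
assert np.allclose(np.sort(np.linalg.eigvals(c * A)), c * lam)

# Inverse: eigenvalues of A^{-1} are 1/lambda_i
assert np.allclose(np.sort(np.linalg.eigvals(np.linalg.inv(A))), np.sort(1 / lam))

# Triangular matrix: eigenvalues are its diagonal entries
assert np.allclose(np.sort(np.linalg.eigvals(B)), np.sort(np.diag(B)))

# Nonzero spectra of AB and BA agree, hence Tr(AB) = Tr(BA)
assert np.allclose(np.sort(np.linalg.eigvals(A @ B)),
                   np.sort(np.linalg.eigvals(B @ A)))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))
```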
Orthogonal Matrices
An orthogonal matrix is a real square matrix \(\mathbf{A}\) such that
\[\mathbf{A}^T=\mathbf{A}^{-1}.\]
Equivalently,
\[\mathbf{A}\mathbf{A}^T=\mathbf{I}\quad\text{and}\quad \mathbf{A}^T\mathbf{A}=\mathbf{I}.\]
This means the rows (and columns) of \(\mathbf{A}\) form an orthonormal set:
- the inner product of any two different rows is \(0\),
- the inner product of any row with itself is \(1\).
Geometrically, orthogonal matrices represent length- and angle-preserving transformations (rotations and reflections). In particular, \(\|\mathbf{Ax}\|_2=\|\mathbf{x}\|_2\) for all \(\mathbf{x}\).
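These statements can be checked numerically. In the sketch below (assuming NumPy), an orthogonal matrix is obtained from the QR factorization of a random matrix, which is one common way to generate one:

```python
import numpy as np

rng = np.random.default_rng(1)

# QR factorization of a random square matrix yields an orthogonal factor Q
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# Q^T is the inverse of Q: Q Q^T = Q^T Q = I
assert np.allclose(Q @ Q.T, np.eye(4))
assert np.allclose(Q.T @ Q, np.eye(4))

# Lengths are preserved: ||Q x|| = ||x||
x = rng.standard_normal(4)
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))

# As a consequence of length preservation, all eigenvalues have modulus 1
assert np.allclose(np.abs(np.linalg.eigvals(Q)), 1.0)
```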
Rotation Matrices and Complex Eigenvalues
A particularly beautiful example is the two-dimensional rotation matrix
\[ \mathbf{R}(\theta)= \begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix}. \]
This matrix rotates every vector in \(\mathbb{R}^2\) by the angle \(\theta\) about the origin. Since rotations preserve lengths and angles, \(\mathbf{R}(\theta)\) is orthogonal and satisfies
\[ \mathbf{R}(\theta)^T\mathbf{R}(\theta)=\mathbf{I}. \]
To find its eigenvalues, compute the characteristic polynomial:
\[ \det(\mathbf{R}(\theta)-\lambda\mathbf{I}) = \det\begin{pmatrix} \cos\theta-\lambda & -\sin\theta\\ \sin\theta & \cos\theta-\lambda \end{pmatrix}. \]
Expanding the determinant gives
\[ (\cos\theta-\lambda)^2+\sin^2\theta = 0, \]
hence
\[ \lambda^2 - 2\cos\theta\,\lambda + 1 = 0. \]
The two eigenvalues are therefore
\[ \lambda_{1,2}=\cos\theta \pm i\sin\theta = e^{\pm i\theta}. \]
This shows the eigenvalues of a planar rotation are themselves points on the unit circle in the complex plane. In other words, the rotation reappears inside its own spectrum.
To see why this is so natural, identify a vector \(\begin{pmatrix}x\\y\end{pmatrix}\in\mathbb{R}^2\) with the complex number \(z=x+iy\). Then applying the matrix \(\mathbf{R}(\theta)\) is exactly the same as multiplying by \(e^{i\theta}\):
\[ \mathbf{R}(\theta) \begin{pmatrix} x\\y \end{pmatrix} = \begin{pmatrix} x\cos\theta-y\sin\theta\\ x\sin\theta+y\cos\theta \end{pmatrix} \quad\longleftrightarrow\quad e^{i\theta}(x+iy). \]
So a planar rotation can be understood in two equivalent ways:
- as a real \(2\times 2\) matrix acting on \(\mathbb{R}^2\), or
- as multiplication by a unit complex number acting on \(\mathbb{C}\).
From this viewpoint, the complex eigenvalues are exciting because they reveal a more natural coordinate system for the transformation. What looks like a coupled two-dimensional motion in real coordinates becomes a simple multiplication in complex coordinates.
Over \(\mathbb{C}\), corresponding eigenvectors do exist. For example, for \(\lambda_1=\cos\theta+i\sin\theta\), one may choose
\[ \mathbf{v}_1= \begin{pmatrix} 1\\ -i \end{pmatrix}, \qquad \text{and for } \lambda_2=\cos\theta-i\sin\theta, \quad \mathbf{v}_2= \begin{pmatrix} 1\\ i \end{pmatrix}. \]
Thus the eigendecomposition does not merely produce numbers: it uncovers a change in representation. A real rotation matrix becomes diagonal once the plane is viewed through the complex-number model, where rotation is simply multiplication by \(e^{i\theta}\).
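A quick numerical check of the rotation example (a sketch assuming NumPy; the angle is an arbitrary choice):

```python
import numpy as np

theta = 0.7  # arbitrary angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# The eigenvalues are e^{+i theta} and e^{-i theta}
lam = np.linalg.eigvals(R)
assert np.allclose(np.sort_complex(lam),
                   np.sort_complex([np.exp(1j * theta), np.exp(-1j * theta)]))

# The complex eigenvector from the text: R v1 = e^{i theta} v1
v1 = np.array([1.0, -1.0j])
assert np.allclose(R @ v1, np.exp(1j * theta) * v1)

# Rotating (x, y) is the same as multiplying x + iy by e^{i theta}
x, y = 2.0, -1.0
p = R @ np.array([x, y])
assert np.isclose(p[0] + 1j * p[1], np.exp(1j * theta) * (x + 1j * y))
```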
Eigenvalue Decomposition (EVD)
If \(\mathbf{A}\in\mathbb{R}^{n\times n}\) has \(n\) linearly independent eigenvectors \(\mathbf{q}_i\) (for \(i=1,\ldots,n\)) and is therefore diagonalisable, then \(\mathbf{A}\) can be factorized as
\[\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1}\]
where \(\mathbf{Q}\) is an \(n\times n\) matrix whose \(i\)-th column is the eigenvector \(\mathbf{q}_i\) of \(\mathbf{A}\) and \(\mathbf{\Lambda}\) is the diagonal matrix whose diagonal elements are the corresponding eigenvalues, \(\Lambda_{ii}=\lambda_i\). The decomposition can be derived directly from the defining equation for eigenvalues and eigenvectors:
\[\begin{array}{rrl} &\mathbf{A} \mathbf{q}_i &= \lambda_i \mathbf{q}_i \quad (i=1,\ldots,n)\\ \Leftrightarrow&\mathbf{A} \mathbf{Q} &= \mathbf{Q} \mathbf{\Lambda} \\ \Leftrightarrow&\mathbf{A} &= \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{-1} . \end{array}\]
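A minimal sketch of this factorization in code (assuming NumPy; the matrix is an arbitrary diagonalisable example), including the fast computation of \(\mathbf{A}^t\) via \(\mathbf{Q}\mathbf{\Lambda}^t\mathbf{Q}^{-1}\):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])              # arbitrary diagonalisable example

lam, Q = np.linalg.eig(A)               # columns of Q are the eigenvectors q_i
Lam = np.diag(lam)                      # Lambda: eigenvalues on the diagonal

# A = Q Lambda Q^{-1}
assert np.allclose(A, Q @ Lam @ np.linalg.inv(Q))

# Powers become cheap: A^t = Q Lambda^t Q^{-1} (only the diagonal is exponentiated)
t = 5
A_t = Q @ np.diag(lam ** t) @ np.linalg.inv(Q)
assert np.allclose(A_t, np.linalg.matrix_power(A, t))
```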
Spectral theorem (important special case): If \(\mathbf{A}\) is real symmetric, it is always diagonalisable with an orthogonal eigenvector matrix. Hence
\[\mathbf{A}=\mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^T,\qquad \mathbf{Q}^T\mathbf{Q}=\mathbf{I}.\]
Note, however, that not every real matrix is diagonalisable over \(\mathbb{R}\). For example, a nontrivial two-dimensional rotation matrix has no real eigenvectors and therefore only becomes diagonalisable when the field is extended to \(\mathbb{C}\).
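For the symmetric case, NumPy's `eigh` routine exploits the spectral theorem and returns an orthogonal eigenvector matrix directly; a nontrivial rotation, by contrast, only yields a complex conjugate pair of eigenvalues. A sketch (the symmetric matrix and the angle are arbitrary examples):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # real symmetric example

lam, Q = np.linalg.eigh(S)              # eigh: for symmetric/Hermitian matrices
assert np.allclose(Q.T @ Q, np.eye(2))  # eigenvector matrix is orthogonal
assert np.allclose(S, Q @ np.diag(lam) @ Q.T)   # S = Q Lambda Q^T

# A nontrivial rotation has no real eigenvectors:
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.linalg.eigvals(R))             # complex conjugate pair e^{+-i theta}
```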
Applications of Eigenvalues and Eigenvectors
- Principal component analysis (PCA)
- Powers of a diagonalizable matrix (fast computation of \(\mathbf{A}^t\))
- Quadratic forms and ellipses/ellipsoids (axes and scaling)
- Stability analysis of linear systems and differential equations
- Graph algorithms (e.g. Laplacians, spectral clustering)