Introduction
Sample variance is a fundamental measure in statistics, capturing how much data points deviate from their mean. Despite its apparent simplicity, computing variance accurately and efficiently, especially in dynamic or streaming contexts, presents challenges due to numerical instability in floating-point arithmetic.
This article derives numerically stable algorithms for computing sample variance in a single pass and for efficiently updating variance when individual data points are replaced. These methods avoid catastrophic cancellation and maintain precision even when the variance is small relative to the magnitude of the data.
Classical Sample Variance
Given data points \( x_1, x_2, \ldots, x_n \), the sample mean \( \bar{x} \) and sample variance \( s^2 \) are defined as: \[ \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i, \quad s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2 \]
By algebraic expansion, the variance can also be written as:
\[ s^2 = \frac{1}{n-1} \left( \sum_{i=1}^n x_i^2 - n \bar{x}^2 \right) \]
While this form is mathematically valid, it is numerically unstable for large \( \bar{x} \) and small \( s^2 \) due to catastrophic cancellation. This can lead to large relative errors or even negative variance, which is mathematically impossible.
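The cancellation is easy to reproduce. The following sketch (function names are ours) compares the expanded one-pass formula against the stable two-pass definition on data with a large common offset:

```javascript
// Expanded "sum of squares" variance: (sum(x^2) - n * mean^2) / (n - 1).
function naiveVariance(xs) {
  const n = xs.length;
  let sum = 0, sumSq = 0;
  for (const x of xs) {
    sum += x;
    sumSq += x * x;
  }
  const mean = sum / n;
  return (sumSq - n * mean * mean) / (n - 1);
}

// Stable two-pass variance: compute the mean first, then sum squared deviations.
function twoPassVariance(xs) {
  const n = xs.length;
  const mean = xs.reduce((a, x) => a + x, 0) / n;
  const S = xs.reduce((a, x) => a + (x - mean) * (x - mean), 0);
  return S / (n - 1);
}

// Small spread (true sample variance 30) on top of a large offset of 1e9.
const data = [4, 7, 13, 16].map((x) => 1e9 + x);
console.log(naiveVariance(data));   // far from 30, may even be negative
console.log(twoPassVariance(data)); // 30
```

With a small offset both functions agree; the failure sets in once \( \bar{x}^2 \) consumes most of the floating-point significand, leaving too few digits for the spread itself.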
Welford’s Method
Welford’s method [Welford62] updates the mean and variance incrementally for each incoming data point. It has been found, both theoretically and experimentally, to offer accuracy and numerical stability comparable to the standard two-pass method by avoiding direct subtraction of large numbers.
Let:
- \( M_k \): mean after \( k \) samples,
- \( S_k \): sum of squared deviations after \( k \) samples.
Initialization: \[ M_1 = x_1, \quad S_1 = 0 \]
Recursive Update:
For \( k \geq 2 \), suppose we have \( M_{k-1} \) and \( S_{k-1} \), and we receive \( x_k \). Then: \[ M_k = M_{k-1} + \frac{x_k - M_{k-1}}{k} \]
We aim to update \( S_k \) such that: \[ S_k = \sum_{i=1}^k (x_i - M_k)^2 \]
Which is achieved by:
\[ S_k = S_{k-1} + (x_k - M_{k-1})(x_k - M_k) \]
Proof:
Split the sum defining \( S_k \) into the first \( k-1 \) points and the new one: \[ S_k = \sum_{i=1}^{k-1} (x_i - M_k)^2 + (x_k - M_k)^2 \]
For the first sum, write \( x_i - M_k = (x_i - M_{k-1}) - (M_k - M_{k-1}) \) and expand. The cross term vanishes because \( \sum_{i=1}^{k-1} (x_i - M_{k-1}) = 0 \), leaving: \[ \sum_{i=1}^{k-1} (x_i - M_k)^2 = S_{k-1} + (k-1)(M_k - M_{k-1})^2 \]
Now let \( d = x_k - M_{k-1} \). The mean update gives \( M_k - M_{k-1} = \frac{d}{k} \), and therefore: \[ x_k - M_k = d - \frac{d}{k} = \frac{k-1}{k} d \]
Substituting both pieces: \[ S_k = S_{k-1} + (k-1)\frac{d^2}{k^2} + \frac{(k-1)^2}{k^2} d^2 = S_{k-1} + \frac{k-1}{k^2} d^2 \left( 1 + (k-1) \right) = S_{k-1} + \frac{k-1}{k} d^2 \]
Finally, since \( \frac{k-1}{k} d^2 = d \cdot \frac{k-1}{k} d = (x_k - M_{k-1})(x_k - M_k) \), we obtain the update formula: \[ S_k = S_{k-1} + (x_k - M_{k-1})(x_k - M_k) \]
Final Variance: \[ s^2 = \frac{S_n}{n-1} \]
This method is numerically stable and requires only \( O(1) \) space.
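Applied over a whole array, the recurrence yields a one-pass variance in a few lines (a standalone sketch; the function name is ours):

```javascript
// One-pass sample variance via Welford's recurrence:
//   M_k = M_{k-1} + (x_k - M_{k-1}) / k
//   S_k = S_{k-1} + (x_k - M_{k-1}) * (x_k - M_k)
function sampleVariance(xs) {
  let mean = 0;
  let S = 0;
  for (let k = 1; k <= xs.length; k++) {
    const x = xs[k - 1];
    const delta = x - mean;  // x_k - M_{k-1}
    mean += delta / k;       // M_k
    S += delta * (x - mean); // accumulate (x_k - M_{k-1})(x_k - M_k)
  }
  return xs.length > 1 ? S / (xs.length - 1) : 0;
}

console.log(sampleVariance([2, 4, 4, 4, 5, 5, 7, 9])); // ≈ 32/7 ≈ 4.5714
```

Note that with `mean = 0` and `S = 0`, the first iteration reduces to \( M_1 = x_1 \), \( S_1 = 0 \), matching the initialization above.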
Variance Update Under Element Replacement
When a data point \( x_{\text{old}} \) in a fixed-size dataset of \( N \) values is replaced with a new value \( x_{\text{new}} \), we can efficiently update both the mean and the variance without recomputing from scratch. This is particularly useful when individual data points are corrected or modified, or in sliding-window settings where the oldest value is removed and a new one is added at each step.
Let \( x_1, x_2, \ldots, x_N \) be a fixed sample with known mean \( \bar{x} \) and variance \( s^2 \) and suppose \( x_{\text{old}} \) is replaced by \( x_{\text{new}} \).
Updated Mean
Given the mean \( \bar{x} \) of the original dataset, the new mean \( \bar{x}' \) after replacement can be derived by:
\[ \begin{array}{rl} \bar{x}' &= \frac{1}{N} \sum_{i=1}^{N} x_i \\ &= \frac{1}{N} \left( \sum_{i=1}^{N} x_i - x_{\text{old}} + x_{\text{new}} \right) \\ &= \frac{1}{N} \left( N \cdot \bar{x} - x_{\text{old}} + x_{\text{new}} \right) \end{array} \]
Which leads to \[ \bar{x}' = \bar{x} + \frac{x_{\text{new}} - x_{\text{old}}}{N} \]
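A quick numeric check of the mean update (the helper name is ours):

```javascript
// Updated mean after replacing oldValue with newValue in a sample of size n.
function updateMean(mean, n, oldValue, newValue) {
  return mean + (newValue - oldValue) / n;
}

// Sample [1, 2, 3, 4, 5] has mean 3; replacing 5 with 10 gives [1, 2, 3, 4, 10].
console.log(updateMean(3, 5, 5, 10)); // 4, matching (1 + 2 + 3 + 4 + 10) / 5
```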
Updated Variance
We seek \( s'^2 \), the updated sample variance with \( s^2 \) being the original sample variance.
Let: \[ S = \sum_{i=1}^N (x_i - \bar{x})^2 = (N-1)s^2 \]
A tempting update is to swap out only the term that changed: \[ S' = S - (x_{\text{old}} - \bar{x})^2 + (x_{\text{new}} - \bar{x}')^2 \] This, however, is incorrect: replacing one value also shifts the mean from \( \bar{x} \) to \( \bar{x}' \), so all \( N \) deviations change, not just the one being swapped. A correct update instead follows from the expanded form of the sum of squared deviations, \[ S = \sum_{i=1}^N x_i^2 - N \bar{x}^2 \] in which only a single term of \( \sum_{i=1}^N x_i^2 \) is affected by the replacement: \[ S' = S + x_{\text{new}}^2 - x_{\text{old}}^2 - N \left( \bar{x}'^2 - \bar{x}^2 \right) \]
Hence: \[ s'^2 = \frac{S'}{N-1} = s^2 + \frac{x_{\text{new}}^2 - x_{\text{old}}^2 - N \left( \bar{x}'^2 - \bar{x}^2 \right)}{N-1} \]
Improved Updated Variance
The formula above contains exactly the kind of subtraction of large squares that makes the expanded variance formula unstable, so we now express the update without explicitly computing squares. With \( \delta = x_{\text{new}} - x_{\text{old}} \), the mean update gives \( \bar{x}' - \bar{x} = \frac{\delta}{N} \), and each difference of squares factors: \[ x_{\text{new}}^2 - x_{\text{old}}^2 = \delta \left( x_{\text{new}} + x_{\text{old}} \right), \quad N \left( \bar{x}'^2 - \bar{x}^2 \right) = N \cdot \frac{\delta}{N} \left( \bar{x}' + \bar{x} \right) = \delta \left( \bar{x}' + \bar{x} \right) \]
Substituting both factorizations: \[ S' - S = \delta \left( x_{\text{new}} + x_{\text{old}} - \bar{x}' - \bar{x} \right) = \delta \left( (x_{\text{new}} - \bar{x}') + (x_{\text{old}} - \bar{x}) \right) \]
Which leads to the numerically friendly update: \[ s'^2 = s^2 + \frac{(x_{\text{new}} - x_{\text{old}})\left( x_{\text{new}} - \bar{x}' + x_{\text{old}} - \bar{x} \right)}{N-1} \]
Every factor is now a small difference of nearby quantities, so no large intermediate squares are formed.
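The update is easy to validate against a full recomputation (function names are ours):

```javascript
// Recompute sample variance from scratch (two-pass) for reference.
function varianceOf(xs) {
  const n = xs.length;
  const mean = xs.reduce((a, x) => a + x, 0) / n;
  return xs.reduce((a, x) => a + (x - mean) * (x - mean), 0) / (n - 1);
}

// O(1) update of (mean, variance) when oldValue is replaced by newValue.
function replaceValue(mean, variance, n, oldValue, newValue) {
  const delta = newValue - oldValue;
  const newMean = mean + delta / n;
  const newVariance =
    variance + (delta * (newValue - newMean + oldValue - mean)) / (n - 1);
  return { mean: newMean, variance: newVariance };
}

const xs = [3, 8, 1, 9, 4]; // mean 5, sample variance 11.5
const updated = replaceValue(5, 11.5, xs.length, 9, 2);
// Should match recomputing on [3, 8, 1, 2, 4]: mean 3.6, variance 7.3
console.log(updated, varianceOf([3, 8, 1, 2, 4]));
```

Since every factor in the update is a difference of nearby values, it stays accurate even when the data carries a large common offset.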
JavaScript Implementation
/**
* RollingStat.js v1.0.0 4/25/2025
* https://raw.org/book/stochastics/efficient-and-accurate-rolling-variance/
*
* Copyright (c) 2025, Robert Eisele (https://raw.org/)
* Licensed under the MIT license.
**/
class RollingStat {

  constructor() {
    this.n = 0;
    this.mean = 0;
    this.M2 = 0; // sum of squared deviations from the mean
  }

  variance() {
    return this.n > 1 ? this.M2 / (this.n - 1) : 0;
  }

  standardDeviation() {
    return Math.sqrt(this.variance());
  }

  add(x) {
    this.n++;
    if (this.n === 1) {
      this.mean = x;
    } else {
      // Welford update: M2 += (x - oldMean) * (x - newMean)
      const delta = x - this.mean;
      this.mean += delta / this.n;
      this.M2 += delta * (x - this.mean);
    }
  }

  replace(oldValue, newValue) {
    if (this.n === 0) return;
    if (this.n === 1) {
      this.mean = newValue;
      return;
    }
    const delta = newValue - oldValue;
    const oldMean = this.mean;
    const newMean = oldMean + delta / this.n;
    this.mean = newMean;
    // S' = S + (newValue - oldValue) * (newValue - newMean + oldValue - oldMean)
    this.M2 += delta * (newValue - newMean + oldValue - oldMean);
  }
}
References
- [Welford62] Welford, B.P. (1962) Note on a method for calculating corrected sums of squares and products. Technometrics, 4(3), 419–420.
- [Knuth97] Knuth, D.E. (1997) The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed., Addison-Wesley, p. 232.
- [Chan83] Chan, T.F., Golub, G.H., LeVeque, R.J. (1983) Algorithms for computing the sample variance: analysis and recommendations. The American Statistician, 37(3), 242–247.
- [Ling74] Ling, R.F. (1974) Comparison of several algorithms for computing sample means and variances. Journal of the American Statistical Association, 69(348), 859–866.
- [Higham02] Higham, N.J. (2002) Accuracy and Stability of Numerical Algorithms. SIAM.