Imagine we want to measure the area of a pond of arbitrary shape in the middle of a field with a known area \(A\). If we throw \(N\) stones randomly into the known area and count the number of stones \(N_{\text{pond}}\) that fall into the pond, the area of the pond is approximated better and better the more stones we throw:

\[A_{\text{pond}} = \frac{N_{\text{pond}}}{N}A\]

This procedure is a simple example of Monte Carlo integration. Let us formalize this approach.

Given a real function \(f:[a, b]\to[0,\infty)\) and a probability density function \(p(x)\) with \(\int\limits_a^bp(x)dx=1\) that is nonzero wherever \(f\) is, the integral \(I(f)=\int\limits_a^bf(x)dx\) can be approximated as follows:

\[\begin{array}{rl} I(f)=&\int\limits_a^bf(x)dx\\ =&\int\limits_a^b\underbrace{\frac{f(x)}{p(x)}}_{g(x)}p(x)dx\\ =&\mathbb{E}[g(x)]\\ =&\lim\limits_{N\to\infty}\frac{1}{N}\sum\limits_{i=1}^Ng(x_i),\qquad x_i\sim p\\ =&\lim\limits_{N\to\infty}I_N(f) \end{array} \]

It follows that for the case that \(x_i\sim \text{unif}(a, b)\), i.e. \(p(x)=\frac{1}{b-a}\),

\[I_N(f) = \frac{1}{N}\sum\limits_{i=1}^N\frac{f(x_i)}{p(x_i)} = \frac{b-a}{N}\sum\limits_{i=1}^Nf(x_i)\]

## Error analysis

The Monte Carlo method clearly yields an approximation. The accuracy depends on the number of samples \(N\) used in the average. A natural first measure is the variance of the samples \(g(x_i)\),

\[\begin{array}{rl} Var[g] &= \mathbb{E}[g^2]-\mathbb{E}[g]^2\\ &\approx \frac{1}{N}\sum\limits_{i=1}^Ng(x_i)^2-\left(\frac{1}{N}\sum\limits_{i=1}^Ng(x_i)\right)^2 \end{array}\]

We should expect the error to decrease with the number of samples \(N\), but \(Var[g]\) does not: it measures the spread of the individual samples, not the accuracy of the estimate. A solution to this problem is to repeat the experiment \(I_N(f)\) up to \(M\) times. According to the central limit theorem, the resulting estimates are normally distributed around an expected value \(\mathbb{E}[I]\). Given such a set of \(M\) estimates, a convenient measure of the error is the *variance of the means* \(Var_M[I_N(f)]\):

\[\begin{array}{rl} Var_M[I_N(f)] &= \mathbb{E}[I^2]-\mathbb{E}[I]^2\\ &= \frac{1}{M}\sum\limits_{m=1}^MI^2_m - \left(\frac{1}{M}\sum\limits_{m=1}^MI_m\right)^2 \end{array}\]

where \(I_m\) denotes the \(m\)-th estimate. Although \(Var_M[I_N(f)]\) gives an estimate of the actual error, making multiple measurements is not practical. Fortunately, this is not needed, since it can be shown that

\[Var_M[I_N(f)]\approx\frac{1}{N}Var[g]\]

This relation becomes exact in the limit \(N\to\infty\). The statistical error is therefore \(\sigma\approx\sqrt{Var[g]/N}\): it decreases with the square root of the number of samples, so reducing the error by a factor of 10 requires 100 times more points.
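As a quick numerical check (a sketch added here for illustration; the helper names `estimate` and `spread` are ours), the square-root scaling can be observed by repeating the experiment \(M\) times for two different sample sizes:

```js
// Monte Carlo estimate of ∫₀¹ x² dx (exact value 1/3) with N uniform samples
function estimate(N) {
  let sum = 0;
  for (let i = 0; i < N; i++) {
    const x = Math.random();
    sum += x * x;
  }
  return sum / N;
}

// Standard deviation of M repeated estimates: the spread of the means
function spread(N, M) {
  const means = [];
  for (let m = 0; m < M; m++) means.push(estimate(N));
  const avg = means.reduce((a, b) => a + b, 0) / M;
  return Math.sqrt(means.reduce((a, b) => a + (b - avg) ** 2, 0) / M);
}

// 100× more samples should shrink the spread by roughly a factor of 10
console.log(spread(100, 200), spread(10000, 200));
```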

## Variance reduction

The quality of the approximation of the integral highly depends on the number of samples \(N\). How can we reduce the noise with a moderate number of samples?

### Importance sampling

With importance sampling, we choose the probability density function \(p(x)\) according to the function \(f(x)\), so that \(p(x)\) is large where \(f(x)\) is large. The optimal choice is

\[p_{\text{optimal}}(x)\propto f(x)\]

For example, sampling the function \(f(x)=e^{-x^2}\) on the interval \([0,1]\) uniformly places many points in the region where \(f(x)\) is small, so a large number of samples is needed to achieve decent accuracy. A reasonable choice for \(p\) is \(p(x)=\frac{1}{z}e^{-x^2}\) where \(z=\int_0^1 e^{-x^2} dx = \frac{1}{2} \sqrt\pi\, \text{erf}(1)\approx 0.746824\).
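Drawing \(x\) from \(p(x)\propto e^{-x^2}\) requires the inverse error function, which the standard `Math` object does not provide. The sketch below therefore substitutes (our assumption, not the text's prescription) the similarly shaped, easy-to-invert density \(p(x)=\frac{e^{-x}}{1-e^{-1}}\) on \([0,1]\):

```js
// Importance-sampled estimate of ∫₀¹ e^(-x²) dx ≈ 0.746824
function importanceSample(N) {
  const c = 1 - Math.exp(-1); // normalization of e^(-x) on [0, 1]
  let sum = 0;
  for (let i = 0; i < N; i++) {
    const u = Math.random();
    const x = -Math.log(1 - c * u); // inverse CDF: x is distributed as p(x)
    const p = Math.exp(-x) / c;     // density p(x) at the sampled point
    sum += Math.exp(-x * x) / p;    // g(x) = f(x) / p(x)
  }
  return sum / N;
}

console.log(importanceSample(100000)); // ≈ 0.7468
```

Because \(e^{-x}\) roughly tracks \(e^{-x^2}\) on \([0,1]\), the ratio \(f/p\) varies little, and the estimate converges faster than with uniform sampling.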

### Stratified sampling

### Rejection sampling

## Examples

### Constant function

Given a constant function \(f(x) = k\) and a uniform probability density function, we can calculate the integral either analytically, \(\int\limits_a^bf(x)dx = \int\limits_a^bkdx = k(b-a)\), or with Monte Carlo:

\[\int\limits_a^bf(x)dx = \int\limits_a^bkdx \approx \frac{1}{N}\sum\limits_{i=1}^Nk(b-a)=k(b-a)\]

### Approximation of Pi

Given a two-dimensional function \(f:[0,1]^2\to [0,\infty)\) and a uniform probability density function, we can approximate \(\pi\) by defining \(f(x,y)\) to return one if \((x,y)\) lies within the unit circle and zero otherwise. The area of the quarter circle is then

\[A=\int\limits_0^1\int\limits_0^1f(x,y)dxdy\approx\frac{1}{N}\sum\limits_{i=1}^Nf(x_i,y_i)\]

From \(A=\frac{\pi}{4}\) it follows that

\[\pi\approx\frac{4}{N}\sum\limits_{i=1}^Nf(x_i,y_i)\]
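A minimal sketch of this estimator (the function name `approximatePi` is ours):

```js
// Estimate π by counting uniform random points inside the unit quarter circle
function approximatePi(N) {
  let inside = 0;
  for (let i = 0; i < N; i++) {
    const x = Math.random();
    const y = Math.random();
    if (x * x + y * y <= 1) inside++; // f(x, y) = 1 inside, 0 outside
  }
  return (4 * inside) / N; // A = π/4  ⇒  π ≈ 4A
}

console.log(approximatePi(1000000)); // ≈ 3.14
```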

## Implementation

The uniform Monte Carlo estimator derived above can be implemented as follows:

```js
function integrate(f, a, b, N) {
  let sum = 0;
  for (let i = 0; i < N; i++) {
    // draw x_i uniformly from [a, b] and accumulate f(x_i)
    sum += f(a + Math.random() * (b - a));
  }
  // I_N(f) = (b - a)/N · Σ f(x_i)
  return (sum / N) * (b - a);
}
```
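For example, integrating \(f(x)=x^2\) over \([0,1]\) (exact value \(1/3\)); the `integrate` function is repeated so the snippet runs on its own:

```js
function integrate(f, a, b, N) {
  let sum = 0;
  for (let i = 0; i < N; i++) {
    sum += f(a + Math.random() * (b - a));
  }
  return (sum / N) * (b - a);
}

// Monte Carlo vs. the exact value 1/3
console.log(integrate(x => x * x, 0, 1, 1000000)); // ≈ 0.333
```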