Simulated Annealing (SA) is a metaheuristic optimization algorithm inspired by the physical process of annealing in metallurgy, where a material is heated to a high temperature and then slowly cooled to decrease defects, resulting in a stable crystalline structure with minimal energy. This algorithm translates the annealing concept into an optimization technique aimed at finding the global optimum of a mathematical function, making it particularly effective for solving complex optimization problems with large search spaces and numerous local minima, such as the Traveling Salesman Problem (TSP), protein folding, graph partitioning, and job-shop scheduling.
The algorithm begins with a random initial solution at a high initial temperature \( T_0 \), which corresponds to a high probability of accepting worse solutions and so allows extensive exploration of the solution space. At each temperature level, SA explores neighboring solutions, accepting both better solutions and, with a certain probability, worse ones. This probabilistic acceptance of worse solutions enables the algorithm to escape local minima. As the temperature gradually decreases over time (a process known as gradual cooling), the probability of accepting worse solutions decreases as well, allowing the algorithm to focus on refining the best solutions found and to converge toward a global minimum without prior knowledge of the search space or gradient information.
Solution to the Traveling Salesman Problem
The Traveling Salesman Problem (TSP) is a well-known optimization challenge: given a set of cities, find the shortest route that visits each city exactly once and returns to the starting city. Using Simulated Annealing, the following simulation approximates a solution by exploring possible routes and probabilistically escaping local minima to find a near-optimal path.
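A minimal sketch of such a simulation, assuming Euclidean city coordinates and a segment-reversal (2-opt-style) neighborhood move; the coordinates, parameters, and move choice are illustrative, not prescribed by the text:

```python
import math
import random

def tour_length(tour, cities):
    """Total length of a closed tour over 2-D city coordinates."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def anneal_tsp(cities, t0=100.0, alpha=0.995, t_min=1e-3, seed=0):
    """Approximate a shortest closed tour with Simulated Annealing."""
    rng = random.Random(seed)
    tour = list(range(len(cities)))
    rng.shuffle(tour)                       # random initial solution
    best, best_len = tour[:], tour_length(tour, cities)
    t = t0
    while t > t_min:
        # Neighborhood move: reverse a random segment of the tour.
        i, j = sorted(rng.sample(range(len(tour)), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        delta = tour_length(cand, cities) - tour_length(tour, cities)
        # Metropolis criterion: always accept improvements,
        # accept worse tours with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            tour = cand
        cur = tour_length(tour, cities)
        if cur < best_len:                  # track best-so-far
            best, best_len = tour[:], cur
        t *= alpha                          # exponential cooling
    return best, best_len
```

For example, for cities placed on a circle, the algorithm should recover a tour close to the circle's perimeter.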
Mathematical Foundations
Problem definition
Let:
- Solution Space \( D \subseteq \mathbb{R}^n \): The set of all possible solutions.
- Objective Function \( f: D \rightarrow \mathbb{R} \): The function to be minimized.
- Energy Level \( E(x) = f(x) \): The energy level associated with state \( x \).
- Neighborhood Function \( U: D \rightarrow \mathcal{P}(D) \): Assigns to each point \( x \in D \) a set of neighboring solutions \( U(x) \subseteq D \).
- Temperature Function \( T: \mathbb{N}_0 \rightarrow \mathbb{R}_{>0} \): Defines the temperature at each iteration \( t \).
The goal is to find
\[ x^* = \arg \min_{x \in D} f(x) \]
Acceptance Criterion (Metropolis Criterion)
At each iteration, a new candidate solution \( x' \in U(x) \) is generated. The change in the objective function is calculated as:
\[ \Delta E = f(x') - f(x) \]
The acceptance of the new solution is determined by:
- If \( \Delta E \leq 0 \): Accept \( x' \) unconditionally.
- If \( \Delta E > 0 \): Accept \( x' \) with probability:
\[ P(\text{accept } x') = \exp\left(-\frac{\Delta E}{T(t)}\right) \]
This acceptance criterion is based on the Boltzmann probability distribution from statistical mechanics, where the probability of a system being in a state with energy \( E \) at temperature \( T \) is proportional to \( \exp(-E / kT) \). In Simulated Annealing, the Boltzmann constant \( k \) is typically set to 1.
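The criterion translates directly into code; the helper below is an illustrative sketch with \( k = 1 \), as stated above:

```python
import math
import random

def accept(delta_e, temperature, rng=random):
    """Metropolis criterion: accept improvements unconditionally,
    accept worse moves with probability exp(-delta_e / T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)
```

Note that for a large positive `delta_e` at low temperature the exponential underflows toward zero, so worse moves are effectively always rejected.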
Cooling Schedule
The temperature schedule determines how the temperature \( T(t) \) decreases over time. Common cooling schedules include:
Exponential Decay:
\[ T(t) = T_0 \cdot \alpha^t \quad \text{with } \alpha \in (0,1) \]
Linear Decay:
\[ T(t) = T_0 - \beta t \quad \text{with } \beta > 0 \]
Logarithmic Decay:
\[ T(t) = \frac{T_0}{\log(1 + t)} \]
The choice of the cooling schedule and parameters significantly affects the algorithm's performance, balancing exploration and exploitation.
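The three schedules, sketched as functions of the iteration \( t \) (the parameter values \( T_0 \), \( \alpha \), \( \beta \) are illustrative):

```python
import math

def exponential(t, t0=100.0, alpha=0.95):
    """T(t) = T0 * alpha^t with alpha in (0, 1)."""
    return t0 * alpha ** t

def linear(t, t0=100.0, beta=0.5):
    """T(t) = T0 - beta * t, clamped to a small positive floor
    so the temperature stays valid for large t (an added safeguard)."""
    return max(t0 - beta * t, 1e-12)

def logarithmic(t, t0=100.0):
    """T(t) = T0 / log(1 + t); defined for t >= 1, decays very slowly."""
    return t0 / math.log(1 + t)
```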
Neighborhood Function
The neighborhood function \( U(x) \) is crucial for generating new candidate solutions that are close to the current solution but not too similar. It determines how effectively the algorithm explores the solution space: if the changes are too large, SA behaves like a random search and loses the benefit of gradual improvement.
Continuous Optimization Problems
Gaussian Perturbation:
\[ x' = x + \mathcal{N}(0, \sigma^2) \]
Uniform Perturbation:
\[ x' = x + \text{Uniform}(-\delta, \delta) \]
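For a vector-valued solution, the two perturbations might look like this (the step sizes \( \sigma \) and \( \delta \) are illustrative defaults):

```python
import random

def gaussian_neighbor(x, sigma=0.1, rng=random):
    """x' = x + N(0, sigma^2), applied componentwise."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]

def uniform_neighbor(x, delta=0.1, rng=random):
    """x' = x + Uniform(-delta, delta), applied componentwise."""
    return [xi + rng.uniform(-delta, delta) for xi in x]
```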
Combinatorial Optimization Problems
- Swap: Swap two randomly selected elements.
- Flip: Flip a randomly selected bit in binary strings.
- Insertion: Remove an element and insert it at another position.
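The three moves above, sketched for list-based solutions:

```python
import random

def swap(sol, rng=random):
    """Exchange two randomly chosen positions."""
    s = sol[:]
    i, j = rng.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

def flip(bits, rng=random):
    """Flip one randomly chosen bit in a 0/1 list."""
    b = bits[:]
    i = rng.randrange(len(b))
    b[i] = 1 - b[i]
    return b

def insertion(sol, rng=random):
    """Remove one element and reinsert it at another position."""
    s = sol[:]
    elem = s.pop(rng.randrange(len(s)))
    s.insert(rng.randrange(len(s) + 1), elem)
    return s
```

Each move returns a copy, leaving the current solution untouched so a rejected candidate can simply be discarded.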
Graph-Based Problems
- Add/Remove Edge: Modify the graph structure.
- Add/Remove Vertex: Alter the graph by changing vertices.
The Simulated Annealing Algorithm
Algorithm Steps
Initialization:
- Set \( x \) as a random point in \( D \) at time \( t = 0 \).
- Set initial max temperature \( T_0 \).
- Set \( x_{\text{best}} = x \) and \( f_{\text{best}} = f(x) \).
Iteration:
For \( t = 0 \) to \( N - 1 \):
Generate Neighbor:
\[ x' \in U(x) \]
Compute Energy Difference:
\[ \Delta E = f(x') - f(x) \]
Acceptance Criterion:
- If \( \Delta E \leq 0 \) (the new solution is at least as good):
\[ x \leftarrow x' \]
- Otherwise, accept the new solution with probability \( P(\text{accept } x') \):
  - Generate a random number \( r \in [0,1) \).
  - If \( r < \exp\left(-\frac{\Delta E}{T(t)}\right) \): \[ x \leftarrow x' \]
Update Best Solution:
- If \( f(x) < f_{\text{best}} \):
\[ x_{\text{best}} \leftarrow x, \quad f_{\text{best}} \leftarrow f(x) \]
Update Temperature with Cooling Schedule:
\[ T_{t+1} = T(t+1) \]
Termination:
- Stop when \( T_t \leq T_{\text{min}} \) or \( f(x) \leq f_{\text{threshold}} \).
- Return \( x_{\text{best}} \) as the best found solution.
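The steps above can be assembled into a generic loop; the objective, neighborhood function, and parameters below are illustrative placeholders, not a prescribed implementation:

```python
import math
import random

def simulated_annealing(f, x0, neighbor, t0=10.0, alpha=0.99,
                        t_min=1e-4, seed=0):
    """Minimize f starting from x0, using neighbor(x, rng) to
    generate candidates; returns (x_best, f_best)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)                     # initialization
    x_best, f_best = x, fx
    t, step = t0, 0
    while t > t_min:                      # termination: T <= T_min
        x_new = neighbor(x, rng)          # generate neighbor x' in U(x)
        delta = f(x_new) - fx             # energy difference
        # Metropolis acceptance criterion
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = x_new, fx + delta
        if fx < f_best:                   # update best solution
            x_best, f_best = x, fx
        step += 1
        t = t0 * alpha ** step            # exponential cooling schedule
    return x_best, f_best
```

For example, minimizing \( f(x) = x^2 \) with a Gaussian perturbation brings the best-found solution close to zero.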