A minimum spanning tree (MST) of a graph \(G=(V, E)\) with the vertex set \(V\) is a tree \(T=(V, E_T)\) seen as a subset \(E_T\subseteq E\) of the edges \(E\subseteq V\times V\) of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimal possible total edge weight \(c:E\to\mathbb{R}\):
\[c(T) = \sum\limits_{e\in E_T} c(e)\text{, minimal}\]
Tarjan’s Algorithm
The basic idea of Tarjan’s algorithm is coloring edges either green or red. The edges within the MST will become green and the rest will become red. To describe the algorithm, we need to define a cut of a graph and a circle/cycle of a graph.
Graph Cut
The cut of a graph \(G=(V, E)\) is a vertex partition \((S, V- S)\). Visually it’s a cut right through the graph so that an edge \((v, w)\) crosses the cut \((S, V- S)\) if \(v\in S\) and \(w\in V- S\) or the other way round.
Graph Circle / Cycle
A cycle within graph \(G\) is a path \(v_1, v_2, ..., v_k\) with \(v_1=v_k, k\geq 3\) and \(v_i, v_{i+1}\in E\).
Green Rule
- Perform a cut on graph \(G\), that crosses no green edge.
- From the not yet colored edges take the one with least weight and make it green.
Red Rule
- Find a cycle in graph \(G\), which contains no red edges.
- From the not yet colored edges take the one with highest weight and make it red.
Tarjan’s algorithm now is the iterative application of either the red or green rule (it doesn’t matter in what order). Proof of the correctness of Tarjan’s algorithm to construct a MST in \(G\) that contains only green and no red edges can be shown by using induction and keeping the invariant:
Iteration \(0\): Graph contains MST, since all edges are uncolored. Iteration \(t\to t+1\):
- Case 1: Red Rule: Taking a cycle with no red edges yields to the selection of the edge with the highest weight, which is defenitely not in MST
- Case 2: Green Rule: With a cut we join two potential parts of the MST, which has minimal weight for the so far constructed portion of the MST
Tarjan’s algorithm is a conceptual generalization. Kruskal’s and Prim’s algorithm are typical algorithms to tackle the MST problem in real world, which can be seen as Tarjan’s algorithm with only the green rule (finding cycles is rather complex).
Kruskal’s Algorithm
- Sort edges in non-decreasing order, such that \(c(e_1)\leq c(e_2)\leq\dots\leq c(e_m)\).
- Walk through the sorted edges and add edge \(e_i\) to tree \(T\) if no circle emerges.
Kruskal’s algorithm can be seen as the application of the red rule. In practice Kruskal’s algorithm has the problem that edges must be sorted, which is bound to \(O(|E|\log(|E|))\).
Prim’s Algoritihm
- Choose an arbitrary starting vertex \(v_0\).
- Repeat until all vertices of \(G\)
are vertices of \(T\)
- Take the cheapest edge \(e\in E\) from an already visited vertex \(v\in T\) to a not visited vertex \(w\notin T\)
- Add this edge \(e\) and the newly accessible vertex \(w\) to tree \(T\).
Prim’s algorithm can be seen as the application of the green rule. In practice Prim’s algorithm has the problem to quickly find the cheapest edge, where a Fibonacci Heap can help.