raw book

Needleman-Wunsch Algorithm

Robert Eisele

The Needleman-Wunsch Algorithm determines the global similarity score of two strings \(a, b\in\Sigma^*\) and with backtracking possibly various optimal global alignments of these strings, which is often used in Bioinformatics to align amino acid sequences of two proteins. The algorithm allows an arbitrary scoring model, which determines the cost of edit operations. Calculating the optimal global alignment is a complex task, but the recursive nature of the algorithm allows the formulation of a dynamic programming problem. But even with dynamic programming, the general Needleman-Wunsch Algorithm has a runtime of \(O(n^3)\). In a later paper, Sellers proposed an explicit recursive formulation of the algorithm with constant gap costs, which reduces the runtime to \(O(n^2)\):

Sequence \(a\)
Sequence \(b\)
Optimize for distance
Optimize for distance with skew penalty
Optimize for similarity
For similarity maximization, match scores should be positive and all other scores lower. For distance minimization the reverse should be used.
Match reward
Mismatch penality
Gap penalty
Skew penalty