public class NeedlemanWunsch extends PairwiseAlignmentAlgorithm
The Needleman-Wunsch global alignment algorithm is based on a dynamic programming approach. Given two sequences A and B of sizes n and m, respectively, the idea is to build an (n+1) x (m+1) matrix M that contains the similarity of prefixes of A and B. Every position M[i,j] in the matrix holds the similarity score between the prefixes A[1..i] and B[1..j]. The first row and the first column represent alignments of a prefix with spaces only.
Starting from row 0, column 0, the algorithm computes each position M[i,j] with the following recurrence:
M[0,0] = 0
M[i,j] = max { M[i,j-1]   + scoreInsertion(B[j]),
               M[i-1,j-1] + scoreSubstitution(A[i], B[j]),
               M[i-1,j]   + scoreDeletion(A[i]) }
When the computation finishes, the value at the last position (last row, last column)
is the similarity between the two sequences. This part of the algorithm is performed
by the computeMatrix method. It has quadratic space complexity, since it keeps an
(n+1) x (m+1) matrix in memory. Because computing each cell takes constant time, it
also has quadratic time complexity.
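The matrix fill described above can be sketched as follows. This is a minimal illustration, not the library's computeMatrix implementation: the class name NwMatrixSketch and the fixed scores (match +1, mismatch -1, space -1) are assumptions made for the example, standing in for whatever scoring scheme is actually configured.

```java
// Illustrative sketch only; names and scores are hypothetical.
class NwMatrixSketch {
    // Assumed fixed scores standing in for a real scoring scheme.
    static final int MATCH = 1, MISMATCH = -1, GAP = -1;

    static int scoreSubstitution(char a, char b) {
        return a == b ? MATCH : MISMATCH;
    }

    // Fills the (n+1) x (m+1) matrix row by row, starting from M[0][0] = 0.
    static int[][] computeMatrix(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] M = new int[n + 1][m + 1];

        // First column: prefixes of A aligned with spaces only (deletions).
        for (int i = 1; i <= n; i++) M[i][0] = M[i - 1][0] + GAP;
        // First row: prefixes of B aligned with spaces only (insertions).
        for (int j = 1; j <= m; j++) M[0][j] = M[0][j - 1] + GAP;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int ins = M[i][j - 1] + GAP;
                int sub = M[i - 1][j - 1]
                        + scoreSubstitution(a.charAt(i - 1), b.charAt(j - 1));
                int del = M[i - 1][j] + GAP;
                M[i][j] = Math.max(sub, Math.max(ins, del));
            }
        }
        return M;
    }

    public static void main(String[] args) {
        int[][] M = computeMatrix("ACGT", "AGT");
        // Similarity is in the last row, last column; prints 2 under the assumed scores.
        System.out.println(M[4][3]);
    }
}
```

Each cell is computed with three additions and two comparisons, which is the constant per-cell work behind the quadratic time bound.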
After the matrix has been computed, the alignment can be retrieved by tracing a path
back in the matrix from the last position to the first. This step is performed by
the buildOptimalAlignment method. Since the path has at most (n + m) steps, this
method has linear, O(n + m), time complexity.
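The traceback step can be sketched as follows. This is an illustration under assumptions, not the library's buildOptimalAlignment method: the class name NwTracebackSketch, the fixed scores (match +1, mismatch -1, space -1), the '-' space character, and the tie-breaking order (substitution, then deletion, then insertion) are all choices made for the example.

```java
// Illustrative sketch only; names, scores and tie-breaking are hypothetical.
class NwTracebackSketch {
    static final int MATCH = 1, MISMATCH = -1, GAP = -1;

    static int sub(char a, char b) {
        return a == b ? MATCH : MISMATCH;
    }

    static int[][] computeMatrix(String a, String b) {
        int n = a.length(), m = b.length();
        int[][] M = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) M[i][0] = M[i - 1][0] + GAP;
        for (int j = 1; j <= m; j++) M[0][j] = M[0][j - 1] + GAP;
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= m; j++)
                M[i][j] = Math.max(
                        M[i - 1][j - 1] + sub(a.charAt(i - 1), b.charAt(j - 1)),
                        Math.max(M[i][j - 1] + GAP, M[i - 1][j] + GAP));
        return M;
    }

    // Walks from M[n][m] back to M[0][0]; each step decreases i, j, or both,
    // so the loop runs at most n + m times.
    static String[] buildOptimalAlignment(String a, String b) {
        int[][] M = computeMatrix(a, b);
        StringBuilder top = new StringBuilder(), bottom = new StringBuilder();
        int i = a.length(), j = b.length();
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0
                    && M[i][j] == M[i - 1][j - 1] + sub(a.charAt(i - 1), b.charAt(j - 1))) {
                top.append(a.charAt(i - 1));      // substitution (match or mismatch)
                bottom.append(b.charAt(j - 1));
                i--; j--;
            } else if (i > 0 && M[i][j] == M[i - 1][j] + GAP) {
                top.append(a.charAt(i - 1));      // deletion: A[i] against a space
                bottom.append('-');
                i--;
            } else {
                top.append('-');                  // insertion: space against B[j]
                bottom.append(b.charAt(j - 1));
                j--;
            }
        }
        // The path was collected backwards, so reverse both rows.
        return new String[] { top.reverse().toString(), bottom.reverse().toString() };
    }

    public static void main(String[] args) {
        String[] aligned = buildOptimalAlignment("ACGT", "AGT");
        System.out.println(aligned[0]); // prints ACGT
        System.out.println(aligned[1]); // prints A-GT
    }
}
```

When several predecessors yield the same score, any of them leads to an optimal alignment; a different tie-breaking order may return a different but equally scored alignment.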
If only the similarity value is needed (and not the alignment itself), it is easy to
reduce the space requirement to O(n) by keeping just the last row or column in memory.
This is precisely what is done by the computeScore method. Note that it still
requires quadratic, O(n^2), time.
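The linear-space idea can be sketched as follows. This is an illustration, not the library's computeScore method: the class name NwScoreSketch and the fixed scores (match +1, mismatch -1, space -1) are assumptions. The sketch keeps a single row of m+1 entries plus one saved diagonal value, and overwrites the row in place as it advances.

```java
// Illustrative sketch only; names and scores are hypothetical.
class NwScoreSketch {
    static final int MATCH = 1, MISMATCH = -1, GAP = -1;

    static int sub(char a, char b) {
        return a == b ? MATCH : MISMATCH;
    }

    // Returns the global similarity using one row of m + 1 entries.
    static int computeScore(String a, String b) {
        int n = a.length(), m = b.length();
        int[] row = new int[m + 1];

        // Row 0: prefixes of B aligned with spaces only.
        for (int j = 1; j <= m; j++) row[j] = row[j - 1] + GAP;

        for (int i = 1; i <= n; i++) {
            int diag = row[0];          // M[i-1][0], needed for the cell at j = 1
            row[0] = row[0] + GAP;      // M[i][0]
            for (int j = 1; j <= m; j++) {
                int up = row[j];        // M[i-1][j], read before it is overwritten
                row[j] = Math.max(
                        diag + sub(a.charAt(i - 1), b.charAt(j - 1)),
                        Math.max(row[j - 1] + GAP, up + GAP));
                diag = up;              // becomes M[i-1][j] for the next column
            }
        }
        return row[m];
    }

    public static void main(String[] args) {
        System.out.println(computeScore("ACGT", "AGT")); // prints 2
    }
}
```

Every cell of the full matrix is still visited once, which is why the running time stays quadratic even though only one row is stored.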
For a more efficient approach to the global alignment problem, see the CrochemoreLandauZivUkelson algorithm. For local alignment, see the SmithWaterman algorithm.
See Also: SmithWaterman, CrochemoreLandauZivUkelson, CrochemoreLandauZivUkelsonLocalAlignment, CrochemoreLandauZivUkelsonGlobalAlignment
Constructor and Description |
---|
NeedlemanWunsch() |
Methods inherited from class PairwiseAlignmentAlgorithm: getPairwiseAlignment, getScore, loadSequences, setScoringScheme, unloadSequences
Copyright © 2018. All rights reserved.