Class CharSequenceSuffixTrees


  • public class CharSequenceSuffixTrees
    extends java.lang.Object
    Utilities for working with suffix trees of sequences of characters.
    Author:
    Garret Wilson
    • Method Summary

      Modifier and Type Method Description
      static java.lang.CharSequence getLongestRepeatedSubsequence(java.lang.CharSequence charSequence)
      Determines the longest subsequence that is repeated in the given character sequence.
      static java.lang.CharSequence getLongestSequentialRepeatedSubsequence(java.lang.CharSequence charSequence)
      Determines the longest subsequence that is repeated sequentially (immediately followed by an identical copy of itself) in the given character sequence.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CharSequenceSuffixTrees

        public CharSequenceSuffixTrees()
    • Method Detail

      • getLongestRepeatedSubsequence

        public static java.lang.CharSequence getLongestRepeatedSubsequence(java.lang.CharSequence charSequence)
        Determines the longest subsequence that is repeated in the given character sequence.

        This implementation walks the suffix tree of the character sequence and finds the non-leaf node that is farthest from the root, measured in characters. The repeated subsequence is the sequence of characters on the path from the root to that node.

        Parameters:
        charSequence - The character sequence to check.
        Returns:
        The longest repeated subsequence in the given character sequence, or null if no subsequence is repeated.
        Throws:
        java.lang.NullPointerException - if the given character sequence is null.
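        To illustrate the documented contract, here is a minimal sketch that computes the same kind of result without a suffix tree. It is not this library's implementation: it sorts all suffixes and compares neighbours, a naive stand-in for finding the deepest non-leaf node. The class name LongestRepeatedSketch and its methods are hypothetical, and occurrences are assumed to be allowed to overlap, as they would be in a suffix tree.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of CharSequenceSuffixTrees.
final class LongestRepeatedSketch {

    /** Returns the longest substring occurring at least twice, or null if none does. */
    static CharSequence longestRepeated(CharSequence charSequence) {
        final String s = charSequence.toString(); // throws NullPointerException on null input
        final String[] suffixes = new String[s.length()];
        for (int i = 0; i < s.length(); i++) {
            suffixes[i] = s.substring(i);
        }
        // After sorting, suffixes sharing a long prefix end up next to each other.
        Arrays.sort(suffixes);
        String best = null;
        for (int i = 1; i < suffixes.length; i++) {
            final int n = commonPrefixLength(suffixes[i - 1], suffixes[i]);
            if (n > 0 && (best == null || n > best.length())) {
                best = suffixes[i].substring(0, n);
            }
        }
        return best;
    }

    /** Counts the leading characters that two strings have in common. */
    static int commonPrefixLength(String a, String b) {
        final int max = Math.min(a.length(), b.length());
        int n = 0;
        while (n < max && a.charAt(n) == b.charAt(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(longestRepeated("banana")); // prints "ana"
        System.out.println(longestRepeated("abc"));    // prints "null"
    }
}
```

        The sketch trades the efficiency of a suffix tree for brevity; it is meant only to show which input yields which result under the assumptions stated above.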
      • getLongestSequentialRepeatedSubsequence

        public static java.lang.CharSequence getLongestSequentialRepeatedSubsequence(java.lang.CharSequence charSequence)
        Determines the longest subsequence that is repeated sequentially (immediately followed by an identical copy of itself) in the given character sequence.

        This implementation walks the suffix tree of the character sequence. Every non-leaf node marks a repeated sequence, namely the characters on the path from the root to that node. For each such node, the implementation checks whether the same sequence occurs again starting at that node, which indicates that the repeated sequence is immediately followed by an identical sequence. The longest sequence that passes this check is the result.

        Parameters:
        charSequence - The character sequence to check.
        Returns:
        The longest sequentially repeated subsequence in the given character sequence, or null if no subsequence is sequentially repeated.
        Throws:
        java.lang.NullPointerException - if the given character sequence is null.
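        As a rough illustration of a sequential repeat, the brute-force sketch below looks for the longest substring that is immediately followed by an identical copy of itself. It is not this library's implementation and uses no suffix tree. The class name SequentialRepeatSketch is hypothetical, the reading of "sequential" is taken from the description above, and returning the leftmost match when several share the same length is an assumption of the sketch.

```java
// Hypothetical illustration only; not part of CharSequenceSuffixTrees.
final class SequentialRepeatSketch {

    /** Returns the longest substring w such that "ww" occurs in the input, or null if there is none. */
    static CharSequence longestSequentialRepeated(CharSequence charSequence) {
        final String s = charSequence.toString(); // throws NullPointerException on null input
        // Try the longest candidate length first so the first match found is a longest one.
        for (int len = s.length() / 2; len > 0; len--) {
            for (int start = 0; start + 2 * len <= s.length(); start++) {
                // Compare the candidate with the region that directly follows it.
                if (s.regionMatches(start, s, start + len, len)) {
                    return s.substring(start, start + len);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(longestSequentialRepeated("xabcabcy")); // prints "abc"
        System.out.println(longestSequentialRepeated("abcd"));     // prints "null"
    }
}
```

        Note the difference from a plain repeated subsequence: in "banana" the sequence "ana" occurs twice, but the two occurrences overlap rather than follow one another, so this sketch returns "an" (from "anan") instead.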