Class HyphenationTree

java.lang.Object
org.apache.lucene.analysis.compound.hyphenation.TernaryTree
org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
All Implemented Interfaces:
Cloneable, PatternConsumer

public class HyphenationTree extends TernaryTree implements PatternConsumer
This tree structure stores the hyphenation patterns in an efficient way for fast lookup. It provides the method to hyphenate a word. This class has been taken from the Apache FOP project (http://xmlgraphics.apache.org/fop/). It has been slightly modified.
  • Constructor Details

    • HyphenationTree

      public HyphenationTree()
  • Method Details

    • loadPatterns

      public void loadPatterns(File f) throws IOException
      Read hyphenation patterns from an XML file.
      Parameters:
      f - the XML file containing the patterns
      Throws:
      IOException - In case the parsing fails
    • loadPatterns

      public void loadPatterns(InputSource source) throws IOException
      Read hyphenation patterns from an XML file.
      Parameters:
      source - the InputSource for the file
      Throws:
      IOException - In case the parsing fails
    • findPattern

      public String findPattern(String pat)
    • hyphenate

      public Hyphenation hyphenate(String word, int remainCharCount, int pushCharCount)
      Hyphenate word and return a Hyphenation object.
      Parameters:
      word - the word to be hyphenated
      remainCharCount - Minimum number of characters allowed before the hyphenation point.
      pushCharCount - Minimum number of characters allowed after the hyphenation point.
      Returns:
      a Hyphenation object representing the hyphenated word or null if word is not hyphenated.
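      The effect of remainCharCount and pushCharCount can be pictured as a filter over candidate break positions. The following is a minimal, self-contained sketch of that idea and not the code of this class: the class name HyphenPointFilter, the method filter, and the candidate positions are all hypothetical, and how HyphenationTree applies the two minimums internally is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates the two minimum-length limits.
final class HyphenPointFilter {

    /**
     * Keeps only the break positions that leave at least remainCharCount
     * characters before the break and at least pushCharCount characters after it.
     * A position p means "break between index p - 1 and index p".
     */
    static List<Integer> filter(int wordLength, int[] candidates,
                                int remainCharCount, int pushCharCount) {
        List<Integer> kept = new ArrayList<>();
        for (int p : candidates) {
            int before = p;              // characters left of the break
            int after = wordLength - p;  // characters right of the break
            if (before >= remainCharCount && after >= pushCharCount) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // An 11-character word with made-up candidate positions 2, 6 and 9.
        int[] candidates = {2, 6, 9};
        System.out.println(filter(11, candidates, 3, 3)); // prints [6]
    }
}
```

      With both minimums set to 3, position 2 is dropped because only two characters precede it, and position 9 is dropped because only two characters follow it.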
    • hyphenate

      public Hyphenation hyphenate(char[] w, int offset, int len, int remainCharCount, int pushCharCount)
      Hyphenate the word stored in a char array and return a Hyphenation object.
      Parameters:
      w - char array that contains the word
      offset - Offset to first character in word
      len - Length of word
      remainCharCount - Minimum number of characters allowed before the hyphenation point.
      pushCharCount - Minimum number of characters allowed after the hyphenation point.
      Returns:
      a Hyphenation object representing the hyphenated word or null if word is not hyphenated.
    • addClass

      public void addClass(String chargroup)
      Add a character class to the tree. It is used by PatternParser as a callback to add character classes. Character classes define the valid word characters for hyphenation. If a word contains a character not defined in any of the classes, it is not hyphenated. A character class also defines how to normalize characters so that they can be compared with the stored patterns. Pattern files usually use only lower case characters. In that case a class for the letter 'a', for example, should be defined as "aA", the first character being the normalization char.
      Specified by:
      addClass in interface PatternConsumer
      Parameters:
      chargroup - character group
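      The normalization rule described above (every member of a group maps to the group's first character, and a word with an unknown character is rejected) can be sketched with a plain map. This is an illustration only: CharClassSketch and its normalize method are hypothetical names, and the real class stores its character classes differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of character classes, not the Lucene implementation.
final class CharClassSketch {
    private final Map<Character, Character> normalized = new HashMap<>();

    /** Registers a group such as "aA": every member maps to the first character. */
    void addClass(String chargroup) {
        if (chargroup.isEmpty()) {
            return;
        }
        char canonical = chargroup.charAt(0);
        for (int i = 0; i < chargroup.length(); i++) {
            normalized.put(chargroup.charAt(i), canonical);
        }
    }

    /** Returns the normalized word, or null if any character belongs to no class. */
    String normalize(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            Character c = normalized.get(word.charAt(i));
            if (c == null) {
                return null; // unknown character: the word would not be hyphenated
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

      After registering "aA" and "bB", the sketch normalizes "Ab" to "ab" and returns null for "abc", because 'c' belongs to no class.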
    • addException

      public void addException(String word, ArrayList<Object> hyphenatedword)
      Add an exception to the tree. It is used by the PatternParser class as a callback to store the hyphenation exceptions.
      Specified by:
      addException in interface PatternConsumer
      Parameters:
      word - normalized word
      hyphenatedword - a list of alternating strings and hyphen objects.
    • addPattern

      public void addPattern(String pattern, String ivalue)
      Add a pattern to the tree. It is mainly used by the PatternParser class as a callback to add a pattern to the tree.
      Specified by:
      addPattern in interface PatternConsumer
      Parameters:
      pattern - the hyphenation pattern
      ivalue - interletter weight values indicating the desirability and priority of hyphenating at a given point within the pattern. It should contain only digit characters (i.e. '0' to '9').
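      One way to read the interletter values is through the conventional Liang (TeX) pattern scheme: when several patterns match a word, the highest digit wins at each inter-letter slot, and a slot with an odd final value permits a break while an even value forbids one. The sketch below assumes that scheme and assumes that ivalue carries one digit per slot (letters + 1 digits, e.g. "00300" for the four letters of a pattern written as hy3ph). InterletterSketch, apply, and breakPoints are hypothetical names and not methods of this class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of Liang-style scoring, assuming one digit per inter-letter slot.
final class InterletterSketch {

    /**
     * Merges one matched pattern's digits into the running scores, keeping the
     * maximum per slot. Slot i is the position just before letter i of the word,
     * so scores has (word length + 1) entries.
     */
    static void apply(int[] scores, int matchStart, String ivalue) {
        for (int i = 0; i < ivalue.length(); i++) {
            int v = ivalue.charAt(i) - '0';
            int slot = matchStart + i;
            if (slot < scores.length && v > scores[slot]) {
                scores[slot] = v;
            }
        }
    }

    /** Returns the interior slots whose final score is odd (break allowed). */
    static List<Integer> breakPoints(int[] scores) {
        List<Integer> points = new ArrayList<>();
        for (int slot = 1; slot < scores.length - 1; slot++) {
            if ((scores[slot] & 1) == 1) {
                points.add(slot);
            }
        }
        return points;
    }
}
```

      For a six-letter word (seven slots), applying "00300" at letter 0 marks slot 2 with the odd value 3, so a break is allowed there. Applying "0400" at letter 1 afterwards raises slot 2 to the even value 4, which suppresses that break. This is how a higher-priority pattern overrides a lower one.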
    • printStats

      public void printStats(PrintStream out)
      Overrides:
      printStats in class TernaryTree