Class SimpleTokenizer

java.lang.Object
opennlp.tools.tokenize.SimpleTokenizer
All Implemented Interfaces:
Tokenizer

public class SimpleTokenizer extends Object
Performs tokenization using character classes.
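To illustrate what tokenizing by character class means, here is a minimal sketch in plain Java. It is not OpenNLP's source: the class name `CharClassTokenizerSketch`, the four classes (whitespace, letter, digit, other), and the rule "start a new token whenever the class changes" are assumptions made for the example, and the real implementation may treat some cases, such as runs of punctuation, differently.

```java
import java.util.ArrayList;
import java.util.List;

public class CharClassTokenizerSketch {

    // Hypothetical character classes chosen for this illustration.
    private enum CharClass { WHITESPACE, LETTER, DIGIT, OTHER }

    private static CharClass classify(char c) {
        if (Character.isWhitespace(c)) return CharClass.WHITESPACE;
        if (Character.isLetter(c)) return CharClass.LETTER;
        if (Character.isDigit(c)) return CharClass.DIGIT;
        return CharClass.OTHER;
    }

    // Splits s wherever the character class changes; whitespace runs are dropped.
    public static String[] tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 1; i <= s.length(); i++) {
            // A run ends at the end of the input or when the class differs from the run's class.
            boolean boundary = i == s.length()
                    || classify(s.charAt(i)) != classify(s.charAt(start));
            if (boundary) {
                if (classify(s.charAt(start)) != CharClass.WHITESPACE) {
                    tokens.add(s.substring(start, i));
                }
                start = i;
            }
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints: Room|42|,|please|!
        System.out.println(String.join("|", tokenize("Room 42, please!")));
    }
}
```

Because a token boundary depends only on the class of adjacent characters, this style of tokenizer needs no trained model, which is why punctuation attached to a word ends up as its own token.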
  • Field Details

  • Constructor Details

    • SimpleTokenizer

      @Deprecated public SimpleTokenizer()
      Deprecated.
      Use the INSTANCE field to obtain an instance instead; the constructor will be made private in the future.
  • Method Details

    • tokenizePos

      public Span[] tokenizePos(String s)
      Description copied from interface: Tokenizer
      Finds the boundaries of atomic parts in a string.
      Specified by:
      tokenizePos in interface Tokenizer
      Parameters:
      s - The string to be tokenized.
      Returns:
      The Span[] with the spans (offsets into s) for each token as the individual array elements.
    • tokenize

      public String[] tokenize(String s)
      Description copied from interface: Tokenizer
      Splits a string into its atomic parts.
      Specified by:
      tokenize in interface Tokenizer
      Parameters:
      s - The string to be tokenized.
      Returns:
      The String[] with the individual tokens as the array elements.
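
The two methods above describe the same tokens in two forms: offsets into `s`, or the substrings themselves. The sketch below shows how the two relate. It uses plain `int` pairs rather than the library's `Span` type, assumes half-open `[start, end)` offsets, and splits on whitespace only to keep the example short; the class `TokenOffsetsSketch` and its method are hypothetical names, not part of OpenNLP.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenOffsetsSketch {

    // Returns {start, end} pairs for whitespace-delimited tokens; end is exclusive.
    public static int[][] tokenOffsets(String s) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < s.length(); i++) {
            boolean ws = Character.isWhitespace(s.charAt(i));
            if (!ws && start < 0) {
                start = i;                      // first character of a new token
            } else if (ws && start >= 0) {
                spans.add(new int[] {start, i}); // token ended just before i
                start = -1;
            }
        }
        if (start >= 0) {
            spans.add(new int[] {start, s.length()}); // token runs to end of input
        }
        return spans.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        String s = "  to be  ";
        for (int[] span : tokenOffsets(s)) {
            // Each token string is recovered from its offsets with substring.
            System.out.println(span[0] + ".." + span[1] + " " + s.substring(span[0], span[1]));
        }
    }
}
```

Keeping offsets rather than strings preserves each token's position in the original input, which is useful when annotations have to be mapped back onto the untokenized text.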