Class Token

All Implemented Interfaces:
Serializable

public class Token extends Attribute implements Serializable
A token. The definition of a token varies by language, but a token generally corresponds to a word.
  • Constructor Details

  • Method Details

    • getText

      public String getText()
      Returns the text of the token. Note that, in some languages, the text may not be a substring of the character data stored in the AnnotatedText. For example, a Chinese token could start at the end of a line and continue to the next line. The raw text would include the newline character, but the token would not.
      Returns:
      the text of the token
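
The note above can be illustrated with plain strings. This is a minimal sketch that does not use the library: `TokenTextDemo` and `isSubstringOfRaw` are hypothetical names, and the token text is a hard-coded value of the kind `getText()` might return for a Chinese word split across two lines.

```java
final class TokenTextDemo {
    /** Returns true when tokenText occurs verbatim in rawText. */
    static boolean isSubstringOfRaw(String rawText, String tokenText) {
        return rawText.contains(tokenText);
    }

    public static void main(String[] args) {
        // Raw character data: U+4E2D, a newline, then U+6587.
        // The two characters form one word that wraps across a line break.
        String rawText = "\u4E2D\n\u6587";
        // Hypothetical token text: the same word without the newline.
        String tokenText = "\u4E2D\u6587";

        // The token text is not a substring of the raw text.
        System.out.println(isSubstringOfRaw(rawText, tokenText)); // prints false

        // Dropping the newline from the raw span recovers the token text.
        System.out.println(rawText.replace("\n", "").equals(tokenText)); // prints true
    }
}
```

Code that looks up a token in the raw text with a plain string search would therefore miss tokens like this one.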
    • getNormalized

      public List<String> getNormalized()
      Returns the normalized forms of the token, as a list of strings.
      Returns:
      the normalized forms of the token
    • getAnalyses

      public List<MorphoAnalysis> getAnalyses()
      Returns the list of analyses. Note: each item in this list has the least specific type needed to represent it. So, even if the text is Arabic or Chinese, some items may be plain MorphoAnalysis instances rather than instances of the corresponding language-specific subclass. Callers must use instanceof to check whether a particular item is an instance of the subclass.
      Returns:
      the list of analyses
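
The instanceof check described above might look like the following sketch. It does not use the library: the `MorphoAnalysis` class here is a simplified stand-in, and `ChineseAnalysis`, `getReading`, `AnalysesDemo`, and `describe` are hypothetical names chosen for illustration. The real classes and their accessors may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the library's MorphoAnalysis (assumption: only a lemma is modeled).
class MorphoAnalysis {
    private final String lemma;
    MorphoAnalysis(String lemma) { this.lemma = lemma; }
    String getLemma() { return lemma; }
}

// Hypothetical language-specific subclass that carries one extra field.
class ChineseAnalysis extends MorphoAnalysis {
    private final String reading;
    ChineseAnalysis(String lemma, String reading) {
        super(lemma);
        this.reading = reading;
    }
    String getReading() { return reading; }
}

final class AnalysesDemo {
    /** Describes each analysis, using subclass data only when the item has it. */
    static List<String> describe(List<MorphoAnalysis> analyses) {
        List<String> out = new ArrayList<>();
        for (MorphoAnalysis analysis : analyses) {
            if (analysis instanceof ChineseAnalysis) {
                // Safe to cast: the runtime type was checked first.
                ChineseAnalysis chinese = (ChineseAnalysis) analysis;
                out.add(chinese.getLemma() + " [" + chinese.getReading() + "]");
            } else {
                // Plain base-type item: only the common accessors are available.
                out.add(analysis.getLemma());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A mixed list, as a token's analyses could be even for Chinese text.
        List<MorphoAnalysis> analyses = Arrays.asList(
                new MorphoAnalysis("cat"),
                new ChineseAnalysis("\u4E2D\u6587", "zhong1wen2"));
        for (String line : describe(analyses)) {
            System.out.println(line);
        }
    }
}
```

Casting every item to the subclass without the check would throw a ClassCastException on the first plain base-type item.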
    • getSource

      public String getSource()
      Returns the source of this token. This identifies the component that performed the tokenization.
      Returns:
      the source of this token
    • toStringHelper

      protected com.google.common.base.MoreObjects.ToStringHelper toStringHelper()
      Overrides:
      toStringHelper in class Attribute