Interface Token

  • All Known Implementing Classes:
    EquivalenceClassToken, SeparatorToken, TagToken, WordToken

    public interface Token
    Represents the lowest-level unit produced when parsing a text. A unit can be a word, but it can also be one of several special types: a token separator, a tag, or an equivalence class.
    Token separators are the special characters used to separate words in natural language: space, comma, etc. (see Separator).
    Equivalence classes are special "word" tokens that already carry semantic information: they can be dates, numbers, etc. (see EquivalenceClass).
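
The four token kinds described above can be told apart with the is* predicates. The following is a minimal sketch, not the library's own code: the nested Token interface is a reduced stand-in for the documented one, and SimpleToken, Kind, and describe are hypothetical names introduced only for illustration. Because equivalence classes are described as special "word" tokens, the sketch tests isEquivalenceClass() before isWord(), on the assumption that an implementation might answer true to both.

```java
public class TokenKindDemo {

    // Stand-in for the documented interface, reduced to the methods used here.
    interface Token {
        boolean isWord();
        boolean isSeparator();
        boolean isEquivalenceClass();
        boolean isTag();
        String getText();
    }

    // Hypothetical kinds, for illustration only; not part of the documented API.
    enum Kind { WORD, SEPARATOR, EQUIVALENCE_CLASS, TAG }

    // Hypothetical implementation; the real implementing classes are
    // WordToken, SeparatorToken, EquivalenceClassToken and TagToken.
    static final class SimpleToken implements Token {
        private final Kind kind;
        private final String text;

        SimpleToken(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override public boolean isWord() { return kind == Kind.WORD; }
        @Override public boolean isSeparator() { return kind == Kind.SEPARATOR; }
        @Override public boolean isEquivalenceClass() { return kind == Kind.EQUIVALENCE_CLASS; }
        @Override public boolean isTag() { return kind == Kind.TAG; }
        @Override public String getText() { return text; }
    }

    // Maps a token to a readable label. The more specific kinds are tested
    // first, in case an implementation reports an equivalence class as a word too.
    static String describe(Token token) {
        if (token.isSeparator()) return "separator";
        if (token.isTag()) return "tag";
        if (token.isEquivalenceClass()) return "equivalence class";
        if (token.isWord()) return "word";
        return "unknown";
    }

    public static void main(String[] args) {
        Token[] tokens = {
            new SimpleToken(Kind.WORD, "hello"),
            new SimpleToken(Kind.SEPARATOR, ","),
            new SimpleToken(Kind.EQUIVALENCE_CLASS, "2021-03-04"),
            new SimpleToken(Kind.TAG, "<b>")
        };
        for (Token t : tokens) {
            System.out.println(describe(t) + ": " + t.getText());
        }
    }
}
```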
    • Method Detail

      • isWord

        boolean isWord()
      • isSeparator

        boolean isSeparator()
      • isEquivalenceClass

        boolean isEquivalenceClass()
      • isTag

        boolean isTag()
      • getTag

        Tag getTag()
      • getText

        java.lang.String getText()
      • getNext

        Token getNext(TokenProvider nextTokenProvider)
               throws java.io.IOException
        Throws:
        java.io.IOException
      • clearNextCache

        void clearNextCache()
      • getTextForType

        java.lang.String getTextForType()
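
The signatures of getNext and clearNextCache suggest that a token can look up its successor through a TokenProvider and remember the result. The sketch below illustrates one way such a chain could be walked. It rests on several assumptions that this page does not confirm: that getNext returns null at the end of the input, that the successor is cached after the first lookup, and that the provider exposes a method for fetching the next token (named nextToken() here, a hypothetical name). The nested Token and TokenProvider interfaces are reduced stand-ins, and CachingToken, providerOf, and collectTexts are illustrative helpers, not library classes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class TokenChainDemo {

    // Stand-in for TokenProvider. Its real methods are not shown on this
    // page, so nextToken() is a hypothetical name.
    interface TokenProvider {
        Token nextToken() throws IOException;
    }

    // Stand-in for Token, reduced to the three methods used here.
    interface Token {
        String getText();
        Token getNext(TokenProvider nextTokenProvider) throws IOException;
        void clearNextCache();
    }

    // Hypothetical token that remembers its successor after the first lookup.
    static final class CachingToken implements Token {
        private final String text;
        private Token next;        // cached successor, null until resolved
        private boolean resolved;  // separates "not asked yet" from "end of input"

        CachingToken(String text) {
            this.text = text;
        }

        @Override
        public String getText() {
            return text;
        }

        @Override
        public Token getNext(TokenProvider nextTokenProvider) throws IOException {
            if (!resolved) {
                next = nextTokenProvider.nextToken();
                resolved = true;
            }
            return next;
        }

        @Override
        public void clearNextCache() {
            next = null;
            resolved = false;
        }
    }

    // Hypothetical provider that hands out tokens from a fixed list,
    // then returns null (assumed end-of-input signal).
    static TokenProvider providerOf(List<String> texts) {
        Iterator<String> it = texts.iterator();
        return () -> it.hasNext() ? new CachingToken(it.next()) : null;
    }

    // Walks the chain from 'first' and collects each token's text.
    // An IOException from getNext is rethrown unchecked to keep callers simple.
    static List<String> collectTexts(Token first, TokenProvider provider) {
        List<String> texts = new ArrayList<>();
        try {
            for (Token t = first; t != null; t = t.getNext(provider)) {
                texts.add(t.getText());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return texts;
    }

    public static void main(String[] args) {
        TokenProvider provider = providerOf(List.of("world", "!"));
        Token first = new CachingToken("Hello");
        System.out.println(collectTexts(first, provider)); // prints [Hello, world, !]
    }
}
```

In this sketch a second walk from the same first token yields the same texts even though the provider is exhausted, because each token answers from its cache. Calling clearNextCache() on a token makes its next getNext call consult the provider again.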