Package org.predict4all.nlp.parser.token
Interface Token
- All Known Implementing Classes:
EquivalenceClassToken, SeparatorToken, TagToken, WordToken
public interface Token
Represents the lowest-level unit produced when parsing a text. A token can be a word, but it can also be one of three special types: a separator, a tag, or an equivalence class.
Separator tokens are the special characters used to separate words in natural language: space, comma, etc. (see Separator).
Equivalence class tokens are special "word" tokens that already carry semantic information: they can be dates, numbers, etc. (see EquivalenceClass).
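The type predicates listed in the method summary lend themselves to a simple dispatch. The following is a minimal sketch, not the library's own code: `SimpleToken`, `DemoToken`, and `TokenDispatchDemo` are hypothetical stand-ins that mirror only the method names `isWord()`, `isSeparator()`, `isEquivalenceClass()`, `isTag()`, and `getText()` from this page. It assumes each demo token has exactly one kind; whether the real implementations behave that way is not stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, trimmed-down stand-in for the Token interface:
// only the four type predicates and getText() are mirrored.
interface SimpleToken {
    boolean isWord();
    boolean isSeparator();
    boolean isEquivalenceClass();
    boolean isTag();
    String getText();
}

// Hypothetical implementation used only for this sketch (assumption:
// a token has exactly one kind).
final class DemoToken implements SimpleToken {
    enum Kind { WORD, SEPARATOR, EQUIVALENCE_CLASS, TAG }

    private final Kind kind;
    private final String text;

    DemoToken(Kind kind, String text) {
        this.kind = kind;
        this.text = text;
    }

    @Override public boolean isWord() { return kind == Kind.WORD; }
    @Override public boolean isSeparator() { return kind == Kind.SEPARATOR; }
    @Override public boolean isEquivalenceClass() { return kind == Kind.EQUIVALENCE_CLASS; }
    @Override public boolean isTag() { return kind == Kind.TAG; }
    @Override public String getText() { return text; }
}

final class TokenDispatchDemo {
    // Checks the special types before isWord(), so the result stays sensible
    // even if an implementation reports an equivalence class as a word too.
    static String describe(SimpleToken token) {
        if (token.isSeparator()) return "separator";
        if (token.isEquivalenceClass()) return "equivalence class";
        if (token.isTag()) return "tag";
        if (token.isWord()) return "word";
        return "unknown";
    }

    // Collects the text of every token that reports itself as a word.
    static List<String> wordsOnly(List<SimpleToken> tokens) {
        List<String> words = new ArrayList<>();
        for (SimpleToken token : tokens) {
            if (token.isWord()) {
                words.add(token.getText());
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<SimpleToken> tokens = Arrays.<SimpleToken>asList(
                new DemoToken(DemoToken.Kind.WORD, "On"),
                new DemoToken(DemoToken.Kind.SEPARATOR, " "),
                new DemoToken(DemoToken.Kind.EQUIVALENCE_CLASS, "12/03/2020"),
                new DemoToken(DemoToken.Kind.SEPARATOR, "."));
        for (SimpleToken token : tokens) {
            System.out.println("[" + token.getText() + "] -> " + describe(token));
        }
        System.out.println(wordsOnly(tokens));
    }
}
```

Testing the special types first is a defensive choice: the description above calls equivalence classes special "word" tokens, so a caller cannot rule out that both predicates are true for the same token.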
-
Field Summary
Modifier and Type    Field                     Description
static byte          TYPE_EQUIVALENCE_CLASS
static byte          TYPE_SEPARATOR
static byte          TYPE_TAG
static byte          TYPE_WORD
-
Method Summary
Modifier and Type    Method                                      Description
void                 clearNextCache()
EquivalenceClass     getEquivalenceClass()
Token                getNext(TokenProvider nextTokenProvider)
Separator            getSeparator()
Tag                  getTag()
String               getText()
String               getTextForType()
int                  getWordId(WordDictionary dictionary)
boolean              isEquivalenceClass()
boolean              isSeparator()
boolean              isTag()
boolean              isWord()
-
Field Detail
-
TYPE_SEPARATOR
static final byte TYPE_SEPARATOR
- See Also:
- Constant Field Values
-
TYPE_WORD
static final byte TYPE_WORD
- See Also:
- Constant Field Values
-
TYPE_EQUIVALENCE_CLASS
static final byte TYPE_EQUIVALENCE_CLASS
- See Also:
- Constant Field Values
-
TYPE_TAG
static final byte TYPE_TAG
- See Also:
- Constant Field Values
-
Method Detail
-
isWord
boolean isWord()
-
isSeparator
boolean isSeparator()
-
isEquivalenceClass
boolean isEquivalenceClass()
-
isTag
boolean isTag()
-
getTag
Tag getTag()
-
getText
String getText()
-
getSeparator
Separator getSeparator()
-
getNext
Token getNext(TokenProvider nextTokenProvider) throws IOException
- Throws: IOException
-
clearNextCache
void clearNextCache()
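This page gives no description for `getNext` or `clearNextCache`, so the following is only a guess at the pattern their names suggest: a token remembers its successor after the first lookup, and clearing the cache forces the next lookup to go back to the provider. Everything in the sketch is hypothetical (`ChainToken`, `NextProvider`, `QueueProvider`, and the convention that `null` marks the end of input); none of it is taken from the library.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical provider: returns the following token, or null when exhausted.
interface NextProvider {
    ChainToken nextToken() throws IOException;
}

// Hypothetical token that caches its successor (assumed behavior, not confirmed).
final class ChainToken {
    private final String text;
    private ChainToken cachedNext; // successor remembered after the first lookup

    ChainToken(String text) {
        this.text = text;
    }

    String getText() {
        return text;
    }

    // Asks the provider only when no successor is cached yet.
    ChainToken getNext(NextProvider provider) throws IOException {
        if (cachedNext == null && provider != null) {
            cachedNext = provider.nextToken();
        }
        return cachedNext;
    }

    // Drops the cached successor so a later getNext() consults the provider again.
    void clearNextCache() {
        cachedNext = null;
    }
}

// Hypothetical in-memory provider that counts how often it is consulted.
final class QueueProvider implements NextProvider {
    private final Deque<ChainToken> pending = new ArrayDeque<>();
    int calls = 0;

    QueueProvider(List<String> texts) {
        for (String text : texts) {
            pending.add(new ChainToken(text));
        }
    }

    @Override
    public ChainToken nextToken() {
        calls++;
        return pending.poll(); // null once the queue is empty
    }
}

final class NextCacheDemo {
    public static void main(String[] args) throws IOException {
        QueueProvider provider = new QueueProvider(Arrays.asList("world", "!"));
        ChainToken first = new ChainToken("hello");

        ChainToken a = first.getNext(provider); // consults the provider
        ChainToken b = first.getNext(provider); // served from the cache
        System.out.println(a.getText() + " " + (a == b) + " " + provider.calls);

        first.clearNextCache();
        ChainToken c = first.getNext(provider); // consults the provider again
        System.out.println(c.getText() + " " + provider.calls);
    }
}
```

In this sketch, clearing the cache makes the token pick up whatever the provider yields next, which may differ from the previously cached successor. Whether the real interface behaves this way should be checked against the library source.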
-
getEquivalenceClass
EquivalenceClass getEquivalenceClass()
-
getTextForType
String getTextForType()
-
getWordId
int getWordId(WordDictionary dictionary)
-