Package com.yahoo.language.simple
Class SimpleToken
java.lang.Object
com.yahoo.language.simple.SimpleToken
- All Implemented Interfaces: Token
- Author: Mathias Mølster Lidal
-
Constructor Summary
SimpleToken(String original)
SimpleToken(String original, String tokenString)
-
Method Summary
Modifier and Type, Method: Description
addComponent(Token token)
boolean equals
static SimpleToken fromStems
getComponent(int i): Returns a component token of this token.
int getNumComponents(): Returns the number of components, if this token is a compound word (e.g. German "kommunikationsfehler").
int getNumStems(): Returns the number of stem forms available for this token.
long getOffset(): Returns the offset position of this token.
getOrig(): Returns the original form of this token.
getScript(): Returns the script of this token.
getStem(int i): Returns the stem at position i.
getTokenString(): Returns the token string in a form suitable for indexing: the most lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word.
getType(): Returns the type of this token: word, space, punctuation, etc.
int hashCode()
boolean isIndexable(): Whether this token should be indexed.
boolean isSpecialToken(): Returns whether this is an instance of a declared special token (e.g. c++).
setOffset(long offset)
setScript(TokenScript script)
setSpecialToken(boolean specialToken)
setTokenString(String str)
setType
toDetailString
toString()
-
Constructor Details
-
SimpleToken
-
SimpleToken
-
-
Method Details
-
getOrig
Description copied from interface: Token
Returns the original form of this token.
-
getNumStems
public int getNumStems()
Description copied from interface: Token
Returns the number of stem forms available for this token.
- Specified by: getNumStems in interface Token
-
getStem
Description copied from interface: Token
Returns the stem at position i.
-
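The two stem accessors above are naturally used together as an indexed loop. The following is a minimal sketch of that pattern, not Vespa code: `StemSketch`, `Stemmed`, `allStems`, and `of` are hypothetical names introduced here, `getStem` is assumed to return a `String`, and the stems "leaf" and "leave" are illustrative values only.

```java
import java.util.ArrayList;
import java.util.List;

final class StemSketch {

    // Hypothetical minimal view of the two stem accessors documented above.
    interface Stemmed {
        int getNumStems();
        String getStem(int i); // assumed return type
    }

    // Collects every stem form of a token, in positional order.
    static List<String> allStems(Stemmed token) {
        List<String> stems = new ArrayList<>(token.getNumStems());
        for (int i = 0; i < token.getNumStems(); i++) {
            stems.add(token.getStem(i));
        }
        return stems;
    }

    // Builds a stand-in token backed by a fixed array of stems.
    static Stemmed of(String... stems) {
        return new Stemmed() {
            public int getNumStems() { return stems.length; }
            public String getStem(int i) { return stems[i]; }
        };
    }

    public static void main(String[] args) {
        System.out.println(allStems(of("leaf", "leave"))); // prints [leaf, leave]
    }
}
```

With a real token, the same loop would call `getNumStems()` and `getStem(i)` on the token itself.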
getNumComponents
public int getNumComponents()
Description copied from interface: Token
Returns the number of components if this token is a compound word (e.g. German "kommunikationsfehler"); otherwise returns 0.
- Specified by: getNumComponents in interface Token
- Returns: number of components, or 0 if none
-
getComponent
Description copied from interface: Token
Returns a component token of this token.
- Specified by: getComponent in interface Token
-
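The component accessors above describe a small tree: a compound token holds component tokens, and a non-compound token reports 0 components. The following sketch models that contract with a hypothetical stand-in class, `CompoundSketch`, rather than the real `SimpleToken`. The chained `addComponent` return value and the split of "kommunikationsfehler" into "kommunikations" and "fehler" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token with components; not the Vespa class itself.
final class CompoundSketch {
    private final String orig;
    private final List<CompoundSketch> components = new ArrayList<>();

    CompoundSketch(String orig) { this.orig = orig; }

    String getOrig() { return orig; }

    // Returns this token so calls can be chained (an assumption of this sketch).
    CompoundSketch addComponent(CompoundSketch component) {
        components.add(component);
        return this;
    }

    // 0 for a non-compound token, matching the documented contract.
    int getNumComponents() { return components.size(); }

    CompoundSketch getComponent(int i) { return components.get(i); }

    // Collects the original forms of all components, in order.
    static List<String> componentForms(CompoundSketch token) {
        List<String> forms = new ArrayList<>();
        for (int i = 0; i < token.getNumComponents(); i++) {
            forms.add(token.getComponent(i).getOrig());
        }
        return forms;
    }

    public static void main(String[] args) {
        CompoundSketch word = new CompoundSketch("kommunikationsfehler")
                .addComponent(new CompoundSketch("kommunikations"))
                .addComponent(new CompoundSketch("fehler"));
        System.out.println(componentForms(word));                           // prints [kommunikations, fehler]
        System.out.println(new CompoundSketch("fehler").getNumComponents()); // prints 0
    }
}
```

Because a plain word simply has an empty component list, callers can loop over components without a separate "is compound" check.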
addComponent
-
getTokenString
Description copied from interface: Token
Returns the token string in a form suitable for indexing: the most lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word. If this is a special token with a configured replacement, this returns the replacement token.
- Specified by: getTokenString in interface Token
-
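The rule described for getTokenString can be read as a short decision: prefer a configured special-token replacement, otherwise take the most processed form available and lowercase it. The sketch below illustrates that reading of the Token contract only; it is not how `SimpleToken` stores or computes its token string. `IndexFormSketch`, `indexForm`, and its three parameters are hypothetical, and the replacement "cpp" for "C++" is an invented example value.

```java
import java.util.Locale;

final class IndexFormSketch {

    // original:           the token as it appeared in the text
    // processed:          a more processed form if one exists, else null (assumption)
    // specialReplacement: configured replacement for a special token, else null (assumption)
    static String indexForm(String original, String processed, String specialReplacement) {
        if (specialReplacement != null) {
            return specialReplacement; // a configured replacement wins
        }
        String base = (processed != null) ? processed : original;
        // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
        return base.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(indexForm("Kommunikationsfehler", null, null)); // prints kommunikationsfehler
        System.out.println(indexForm("Running", "Run", null));             // prints run
        System.out.println(indexForm("C++", null, "cpp"));                 // prints cpp
    }
}
```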
setTokenString
-
getType
Description copied from interface: Token
Returns the type of this token: word, space, punctuation, etc.
-
setType
-
getScript
Description copied from interface: Token
Returns the script of this token.
-
setScript
-
isSpecialToken
public boolean isSpecialToken()
Description copied from interface: Token
Returns whether this is an instance of a declared special token (e.g. c++).
- Specified by: isSpecialToken in interface Token
-
setSpecialToken
-
getOffset
public long getOffset()
Description copied from interface: Token
Returns the offset position of this token.
-
setOffset
-
equals
-
hashCode
public int hashCode()
-
toString
-
toDetailString
-
isIndexable
public boolean isIndexable()
Description copied from interface: Token
Whether this token should be indexed.
- Specified by: isIndexable in interface Token
-
fromStems
-