Package com.yahoo.language.simple
Class SimpleToken
java.lang.Object
    com.yahoo.language.simple.SimpleToken
Constructor Summary
Constructor
SimpleToken(String orig)
SimpleToken(String orig, String tokenString)
-
Method Summary
Modifier and Type, Method, Description
SimpleToken addComponent(Token token)
boolean equals(Object obj)
Token getComponent(int i)
    Returns a component token of this token.
int getNumComponents()
    Returns the number of components if this token is a compound word (e.g. German "kommunikationsfehler").
int getNumStems()
    Returns the number of stem forms available for this token.
long getOffset()
    Returns the offset position of this token.
String getOrig()
    Returns the original form of this token.
TokenScript getScript()
    Returns the script of this token.
String getStem(int i)
    Returns the stem at position i.
String getTokenString()
    Returns the token string in a form suitable for indexing: the most lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word.
TokenType getType()
    Returns the type of this token: word, space, punctuation, etc.
int hashCode()
boolean isIndexable()
    Whether this token should be indexed.
boolean isSpecialToken()
    Returns whether this is an instance of a declared special token (e.g. c++).
SimpleToken setOffset(long offset)
SimpleToken setScript(TokenScript script)
SimpleToken setSpecialToken(boolean specialToken)
SimpleToken setTokenString(String str)
SimpleToken setType(TokenType type)
String toString()
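Every setter listed above returns SimpleToken, which suggests the calls are meant to be chained. The sketch below illustrates that pattern with a hypothetical stand-in class, SketchToken, so that it compiles without the Vespa linguistics library on the classpath. SketchToken, the choice to return `this` from each setter, and the sample values are assumptions made for illustration; this page does not show how SimpleToken implements them.

```java
public class FluentTokenSketch {

    /** Hypothetical stand-in that mirrors a few of the documented signatures. */
    static final class SketchToken {
        private final String orig;
        private String tokenString;
        private long offset;
        private boolean specialToken;

        SketchToken(String orig, String tokenString) {
            this.orig = orig;
            this.tokenString = tokenString;
        }

        String getOrig() { return orig; }
        String getTokenString() { return tokenString; }
        long getOffset() { return offset; }
        boolean isSpecialToken() { return specialToken; }

        // Assumption: each setter returns the receiver, which is what allows chaining.
        SketchToken setTokenString(String str) { this.tokenString = str; return this; }
        SketchToken setOffset(long offset) { this.offset = offset; return this; }
        SketchToken setSpecialToken(boolean specialToken) { this.specialToken = specialToken; return this; }
    }

    /** Builds one token in a single chained expression. */
    static SketchToken buildExample() {
        return new SketchToken("C++", "C++")
                .setTokenString("c++")   // lowercased form intended for indexing
                .setOffset(12L)          // illustrative offset into the source text
                .setSpecialToken(true);  // mark as a declared special token
    }

    public static void main(String[] args) {
        SketchToken t = buildExample();
        System.out.println(t.getOrig() + " -> " + t.getTokenString()
                + " @" + t.getOffset() + " special=" + t.isSpecialToken());
    }
}
```

If the real setters behave the same way, a chain on SimpleToken would have the same shape, with setType and setScript added for the TokenType and TokenScript values.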
-
Method Detail
-
getOrig
public String getOrig()
Description copied from interface: Token
Returns the original form of this token
-
getNumStems
public int getNumStems()
Description copied from interface: Token
Returns the number of stem forms available for this token.
Specified by: getNumStems in interface Token
-
getStem
public String getStem(int i)
Description copied from interface: Token
Returns the stem at position i
-
getNumComponents
public int getNumComponents()
Description copied from interface: Token
Returns the number of components if this token is a compound word (e.g. German "kommunikationsfehler"). Otherwise, returns 0.
Specified by: getNumComponents in interface Token
Returns: number of components, or 0 if none
-
getComponent
public Token getComponent(int i)
Description copied from interface: Token
Returns a component token of this token.
Specified by: getComponent in interface Token
-
addComponent
public SimpleToken addComponent(Token token)
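Taken together, getNumComponents, getComponent and addComponent describe a compound token as an ordered list of parts, each of which is itself a token. The sketch below walks such a structure using a hypothetical stand-in class, SketchToken, so that it runs without the Vespa library. The stand-in, the assumption that addComponent appends and returns the receiver, and the split of "kommunikationsfehler" into two parts are all illustrative, not taken from the real implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class CompoundTokenSketch {

    /** Hypothetical stand-in that mirrors the documented component accessors. */
    static final class SketchToken {
        private final String orig;
        private final List<SketchToken> components = new ArrayList<>();

        SketchToken(String orig) { this.orig = orig; }

        String getOrig() { return orig; }

        // Documented contract: 0 when the token is not a compound word.
        int getNumComponents() { return components.size(); }

        SketchToken getComponent(int i) { return components.get(i); }

        // Assumption: appends the component and returns the receiver for chaining.
        SketchToken addComponent(SketchToken token) {
            components.add(token);
            return this;
        }
    }

    /** Collects the original forms of all leaf tokens, depth first. */
    static List<String> leafForms(SketchToken token) {
        List<String> out = new ArrayList<>();
        if (token.getNumComponents() == 0) {
            out.add(token.getOrig());
            return out;
        }
        for (int i = 0; i < token.getNumComponents(); i++) {
            out.addAll(leafForms(token.getComponent(i)));
        }
        return out;
    }

    public static void main(String[] args) {
        SketchToken compound = new SketchToken("kommunikationsfehler")
                .addComponent(new SketchToken("kommunikation"))
                .addComponent(new SketchToken("fehler"));
        System.out.println(leafForms(compound));
    }
}
```

Checking getNumComponents() before calling getComponent(i) keeps the traversal valid for plain words, which report zero components.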
-
getTokenString
public String getTokenString()
Description copied from interface: Token
Returns the token string in a form suitable for indexing: the most lowercased variant of the most processed token form available. If called on a compound token, this returns a lowercased form of the entire word. If this is a special token with a configured replacement, this returns the replacement token.
Specified by: getTokenString in interface Token
-
setTokenString
public SimpleToken setTokenString(String str)
-
getType
public TokenType getType()
Description copied from interface: Token
Returns the type of this token: word, space, punctuation, etc.
-
setType
public SimpleToken setType(TokenType type)
-
getScript
public TokenScript getScript()
Description copied from interface: Token
Returns the script of this token
-
setScript
public SimpleToken setScript(TokenScript script)
-
isSpecialToken
public boolean isSpecialToken()
Description copied from interface: Token
Returns whether this is an instance of a declared special token (e.g. c++).
Specified by: isSpecialToken in interface Token
-
setSpecialToken
public SimpleToken setSpecialToken(boolean specialToken)
-
getOffset
public long getOffset()
Description copied from interface: Token
Returns the offset position of this token
-
setOffset
public SimpleToken setOffset(long offset)
-
isIndexable
public boolean isIndexable()
Description copied from interface: Token
Whether this token should be indexed.
Specified by: isIndexable in interface Token