public class UniversalTextTokenizer extends BaseTextTokenizer
A TextTokenizer implementation for various languages.
See also: Languages
Constructor and Description |
---|
UniversalTextTokenizer() |
Modifier and Type | Method and Description |
---|---|
void | loadAllLanguages() |
void | loadLanguage(Languages... languages) |
java.util.Set<java.lang.String> | stopWords(): Gets all stop-words for a language. |
Inherited methods: convertWord, tokenize
public java.util.Set<java.lang.String> stopWords()
Description copied from TextTokenizer:
Gets all stop-words for a language.
public void loadLanguage(Languages... languages)
public void loadAllLanguages()
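
The method signatures above suggest a load-then-query pattern: load one or more languages, then read the stop-words. The sketch below illustrates that pattern only. It does not use the library: `Lang`, `SketchTokenizer`, and the word lists are hypothetical stand-ins, since this page shows neither the constants of `Languages` nor any method bodies. It also assumes that loading accumulates stop-words across languages and that `loadAllLanguages()` behaves like passing every language to `loadLanguage(...)`. Neither behavior is confirmed here.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class StopWordLoadingSketch {

    // Hypothetical stand-in for the library's Languages type.
    enum Lang { ENGLISH, GERMAN }

    // Tiny illustrative word lists; not the library's data.
    static final Map<Lang, Set<String>> WORD_LISTS = new EnumMap<>(Lang.class);
    static {
        WORD_LISTS.put(Lang.ENGLISH, Set.of("the", "and", "of"));
        WORD_LISTS.put(Lang.GERMAN, Set.of("der", "und", "von"));
    }

    // Hypothetical stand-in that mirrors the three documented signatures.
    static class SketchTokenizer {
        private final Set<String> stopWords = new HashSet<>();

        // Varargs, like loadLanguage(Languages... languages).
        // Assumption: each call adds to the stop-words already loaded.
        void loadLanguage(Lang... languages) {
            for (Lang language : languages) {
                stopWords.addAll(WORD_LISTS.get(language));
            }
        }

        // Assumption: equivalent to loading every known language.
        void loadAllLanguages() {
            loadLanguage(Lang.values());
        }

        // Read-only view, so callers cannot modify the loaded set.
        Set<String> stopWords() {
            return Collections.unmodifiableSet(stopWords);
        }
    }

    public static void main(String[] args) {
        SketchTokenizer tokenizer = new SketchTokenizer();

        tokenizer.loadLanguage(Lang.ENGLISH);
        System.out.println(tokenizer.stopWords().contains("the")); // true
        System.out.println(tokenizer.stopWords().contains("der")); // false

        tokenizer.loadAllLanguages();
        System.out.println(tokenizer.stopWords().size()); // 6
    }
}
```

With the real class, the equivalent calls would be `new UniversalTextTokenizer()`, then `loadLanguage(...)` or `loadAllLanguages()`, then `stopWords()`. Check the `Languages` type for the values it accepts.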