| Constructor and Description |
|---|
| SimpleTokenizer()<br>Creates an instance of SimpleTokenizer with the default delimiter. |
| SimpleTokenizer(java.lang.String delimiter)<br>Creates an instance of SimpleTokenizer with the given delimiter. |

| Modifier and Type | Method and Description |
|---|---|
| java.lang.String | buildSentence(java.util.List<java.lang.String> tokens)<br>Combines a list of tokens to form a sentence. |
| java.util.List<java.lang.String> | tokenize(java.lang.String sentence)<br>Breaks down the given sentence into a list of tokens that can be represented by embeddings. |
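
public SimpleTokenizer(java.lang.String delimiter)
Creates an instance of SimpleTokenizer with the given delimiter.
Parameters: delimiter - the delimiter

public SimpleTokenizer()
Creates an instance of SimpleTokenizer with the default delimiter.

public java.util.List<java.lang.String> tokenize(java.lang.String sentence)
Description copied from interface: Tokenizer
Breaks down the given sentence into a list of tokens that can be represented by embeddings.

public java.lang.String buildSentence(java.util.List<java.lang.String> tokens)
Description copied from interface: Tokenizer
Combines a list of tokens to form a sentence.
Specified by: buildSentence in interface Tokenizer
Parameters: tokens - the List of tokens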
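
The two constructors and two methods documented above can be illustrated with a minimal standalone sketch. It is not the library class: SimpleTokenizerSketch is a hypothetical stand-in written only to mirror the documented signatures. It assumes that the default delimiter is a single space, that tokenize splits on the literal delimiter, and that buildSentence joins tokens with the same delimiter. None of these details is stated on this page, and the real implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in mirroring the documented SimpleTokenizer signatures. */
class SimpleTokenizerSketch {

    private final String delimiter;

    /** Uses a single space as the default delimiter (an assumption, not documented here). */
    SimpleTokenizerSketch() {
        this(" ");
    }

    /** Uses the given delimiter. */
    SimpleTokenizerSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    /** Splits the sentence on the delimiter, treated as a literal string rather than a regex. */
    List<String> tokenize(String sentence) {
        String[] parts = sentence.split(Pattern.quote(delimiter));
        return new ArrayList<>(Arrays.asList(parts));
    }

    /** Joins the tokens back into one sentence with the same delimiter. */
    String buildSentence(List<String> tokens) {
        return String.join(delimiter, tokens);
    }

    public static void main(String[] args) {
        SimpleTokenizerSketch spaceTokenizer = new SimpleTokenizerSketch();
        List<String> tokens = spaceTokenizer.tokenize("Hello deep learning");
        System.out.println(tokens);                               // [Hello, deep, learning]
        System.out.println(spaceTokenizer.buildSentence(tokens)); // Hello deep learning

        SimpleTokenizerSketch commaTokenizer = new SimpleTokenizerSketch(",");
        System.out.println(commaTokenizer.tokenize("a,b,c"));     // [a, b, c]
    }
}
```

Under these assumptions, buildSentence(tokenize(s)) returns s for sentences without leading, trailing, or repeated delimiters. String.split drops trailing empty strings, so a sentence that ends with the delimiter would not round-trip exactly in this sketch.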