Constructor and Description |
---|
SimpleTokenizer() Creates an instance of SimpleTokenizer with the default delimiter. |
SimpleTokenizer(java.lang.String delimiter) Creates an instance of SimpleTokenizer with the given delimiter. |
Modifier and Type | Method and Description |
---|---|
java.lang.String | buildSentence(java.util.List<java.lang.String> tokens) Combines a list of tokens to form a sentence. |
java.util.List<java.lang.String> | tokenize(java.lang.String sentence) Breaks down the given sentence into a list of tokens that can be represented by embeddings. |
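public SimpleTokenizer(java.lang.String delimiter)
Creates an instance of SimpleTokenizer with the given delimiter.
Parameters:
delimiter - the delimiter

public SimpleTokenizer()
Creates an instance of SimpleTokenizer with the default delimiter.

public java.util.List<java.lang.String> tokenize(java.lang.String sentence)
Description copied from interface: Tokenizer
Breaks down the given sentence into a list of tokens that can be represented by embeddings.

public java.lang.String buildSentence(java.util.List<java.lang.String> tokens)
Description copied from interface: Tokenizer
Combines a list of tokens to form a sentence.
Specified by:
buildSentence in interface Tokenizer
Parameters:
tokens - the List of tokens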
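
The constructors and the two methods documented above describe a delimiter-based round trip: tokenize splits a sentence, and buildSentence joins the tokens again. The following is a minimal, self-contained sketch of that behavior, not the library's implementation. The class name DelimiterTokenizerSketch is hypothetical, the single-space default delimiter is an assumption (this page does not state the default), and treating the delimiter as a literal string rather than a regular expression is a choice made for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in that mirrors the documented SimpleTokenizer API; not the library's code. */
class DelimiterTokenizerSketch {
    // Assumption: the default delimiter is a single space; the documentation does not state it.
    private static final String ASSUMED_DEFAULT_DELIMITER = " ";

    private final String delimiter;

    /** Mirrors SimpleTokenizer(): uses the assumed default delimiter. */
    DelimiterTokenizerSketch() {
        this(ASSUMED_DEFAULT_DELIMITER);
    }

    /** Mirrors SimpleTokenizer(String delimiter). */
    DelimiterTokenizerSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    /** Splits the sentence on the literal delimiter. */
    List<String> tokenize(String sentence) {
        // Pattern.quote keeps characters such as "|" or "." from being read as regex syntax.
        return Arrays.asList(sentence.split(Pattern.quote(delimiter)));
    }

    /** Joins the tokens back together with the same delimiter. */
    String buildSentence(List<String> tokens) {
        return String.join(delimiter, tokens);
    }

    public static void main(String[] args) {
        DelimiterTokenizerSketch tokenizer = new DelimiterTokenizerSketch();
        List<String> tokens = tokenizer.tokenize("the quick brown fox");
        System.out.println(tokens);                          // [the, quick, brown, fox]
        System.out.println(tokenizer.buildSentence(tokens)); // the quick brown fox
    }
}
```

With the real class, the equivalent calls would be new SimpleTokenizer().tokenize(sentence) followed by buildSentence(tokens). Whether the library interprets the delimiter literally or as a regular expression is not specified on this page, so it is worth checking before passing characters such as "|" or ".".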