Package opennlp.tools.tokenize
Class SimpleTokenizer
java.lang.Object
opennlp.tools.tokenize.SimpleTokenizer
All Implemented Interfaces:
Tokenizer
Performs tokenization using character classes.
-
Field Summary
Fields
INSTANCE
-
Constructor Summary
Constructors
Constructor: SimpleTokenizer()
Description: Deprecated. Use the INSTANCE field to obtain an instance; the constructor will be made private in the future.
-
Method Summary
-
Field Details
-
INSTANCE
-
-
Constructor Details
-
SimpleTokenizer
Deprecated. Use the INSTANCE field to obtain an instance; the constructor will be made private in the future.
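The deprecation note describes a shared-instance pattern: callers read `SimpleTokenizer.INSTANCE` instead of calling `new SimpleTokenizer()`. The sketch below shows what that pattern typically looks like once the constructor is private. `WhitespaceSplitter` and its `split` method are hypothetical stand-ins invented for illustration; they are not part of OpenNLP and do not reflect its implementation.

```java
// Hypothetical stand-in class illustrating the INSTANCE pattern; not OpenNLP code.
public final class WhitespaceSplitter {

    /** The single shared instance. The class holds no mutable state, so sharing it is safe. */
    public static final WhitespaceSplitter INSTANCE = new WhitespaceSplitter();

    // Private constructor: callers cannot instantiate and must go through INSTANCE.
    private WhitespaceSplitter() {
    }

    /** Splits on runs of whitespace; returns an empty array for blank input. */
    public String[] split(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }
}
```

Code written against `INSTANCE` today keeps compiling after the constructor becomes private, which is presumably why the note recommends switching now.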
-
-
Method Details
-
tokenizePos
Description copied from interface: Tokenizer
Finds the boundaries of atomic parts in a string.
Parameters:
s - The string to be tokenized.
Returns:
The Span[] with the spans (offsets into s) for each token as the individual array elements.
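As a rough illustration of finding token boundaries by character class, the sketch below starts a new token whenever the class of the current character differs from the previous one, and treats whitespace as a separator. It is an assumption-laden approximation, not OpenNLP's implementation: the `CharClassSpans` class, the four classes, the rule that every "other" character (such as punctuation) forms its own token, and the use of `int[] {start, end}` pairs in place of `Span` objects are all choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of character-class tokenization; not OpenNLP code.
public class CharClassSpans {

    // Coarse character classes assumed for this sketch.
    private enum CharClass { WHITESPACE, LETTER, DIGIT, OTHER }

    private static CharClass classify(char c) {
        if (Character.isWhitespace(c)) return CharClass.WHITESPACE;
        if (Character.isLetter(c)) return CharClass.LETTER;
        if (Character.isDigit(c)) return CharClass.DIGIT;
        return CharClass.OTHER;
    }

    /** Returns one {start, end} pair per token, as offsets into s with end exclusive. */
    public static List<int[]> tokenizePos(String s) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        CharClass previous = CharClass.WHITESPACE;
        for (int i = 0; i < s.length(); i++) {
            CharClass current = classify(s.charAt(i));
            if (current == CharClass.WHITESPACE) {
                // Whitespace closes any open token and is never part of one.
                if (start != -1) {
                    spans.add(new int[] {start, i});
                    start = -1;
                }
            } else if (start == -1) {
                start = i; // first character of a new token
            } else if (current != previous || current == CharClass.OTHER) {
                // Class changed (or punctuation repeats): close the token and open a new one.
                spans.add(new int[] {start, i});
                start = i;
            }
            previous = current;
        }
        if (start != -1) {
            spans.add(new int[] {start, s.length()});
        }
        return spans;
    }
}
```

For the input `"Hi, 42!"` this sketch yields the pairs (0, 2), (2, 3), (4, 6) and (6, 7), which cover `Hi`, `,`, `42` and `!`.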
-
tokenize
Description copied from interface: Tokenizer
Splits a string into its atomic parts.
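The string view returned by tokenize and the offset view returned by tokenizePos describe the same tokens: each token string is the substring covered by one span. The helper below sketches that relationship. `SpansToTokens` and `tokensFromSpans` are hypothetical names, and plain `{start, end}` pairs (end exclusive) stand in for `Span` objects; this is not how the library itself is written.

```java
// Hypothetical helper relating offsets to token strings; not OpenNLP code.
public class SpansToTokens {

    /** Cuts one substring out of s for each {start, end} pair, end exclusive. */
    public static String[] tokensFromSpans(String s, int[][] spans) {
        String[] tokens = new String[spans.length];
        for (int i = 0; i < spans.length; i++) {
            tokens[i] = s.substring(spans[i][0], spans[i][1]);
        }
        return tokens;
    }
}
```

Keeping offsets around is useful when tokens must be mapped back to the original text, for example to highlight them; the string array is more convenient when only the token text matters.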
-