Package | Description |
---|---|
weka.core.tokenizers | |
weka.filters.unsupervised.attribute | |

Subclasses of Tokenizer in weka.core.tokenizers

Modifier and Type | Class | Description |
---|---|---|
class | AlphabeticTokenizer | Alphabetic string tokenizer; tokens are formed only from contiguous alphabetic sequences. |
class | CharacterDelimitedTokenizer | Abstract superclass for tokenizers that take characters as delimiters. |
class | NGramTokenizer | Splits a string into n-grams, with configurable minimum and maximum n-gram sizes. |
class | WordTokenizer | A simple tokenizer that uses the java.util.StringTokenizer class to tokenize strings. |
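
The tokenizing strategies listed above can be illustrated with a minimal, standard-library-only sketch. This is not Weka's implementation: the class name `TokenizerSketch`, its method names, the restriction to ASCII letters, the use of word-level n-grams, and the ordering of the n-grams (grouped by size) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; it does not use or reproduce any Weka code.
public class TokenizerSketch {

    // Alphabetic tokenization: keep only contiguous runs of letters
    // (ASCII letters here, which is an assumption of this sketch).
    public static List<String> alphabeticTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[a-zA-Z]+").matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Character-delimited tokenization using java.util.StringTokenizer:
    // every character in 'delimiters' separates tokens.
    public static List<String> wordTokens(String text, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Word n-grams for every size n in [minN, maxN], words joined by one space.
    // Grouping the output by size is a choice of this sketch.
    public static List<String> nGrams(String text, String delimiters, int minN, int maxN) {
        List<String> words = wordTokens(text, delimiters);
        List<String> grams = new ArrayList<>();
        for (int n = minN; n <= maxN; n++) {
            for (int start = 0; start + n <= words.size(); start++) {
                grams.add(String.join(" ", words.subList(start, start + n)));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        String text = "Weka 3.8: text mining, made simple!";
        System.out.println(alphabeticTokens(text));
        System.out.println(wordTokens(text, " .,:!"));
        System.out.println(nGrams("the quick brown fox", " ", 1, 2));
    }
}
```

The alphabetic variant needs no delimiter set because anything that is not a letter ends a token, while the delimiter-based variants keep digits and other characters unless they are listed as delimiters.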

Methods in weka.core.tokenizers with parameters of type Tokenizer

Modifier and Type | Method | Description |
---|---|---|
static void | Tokenizer.runTokenizer(Tokenizer tokenizer, String[] options) | Initializes the given tokenizer with the given options, runs it over all the remaining strings in the options array, and prints the generated tokens to stdout. |
static String[] | Tokenizer.tokenize(Tokenizer tokenizer, String[] options) | Initializes the given tokenizer with the given options, runs it over all the remaining strings in the options array, and returns the generated tokens. |

Methods in weka.filters.unsupervised.attribute that return Tokenizer

Modifier and Type | Method | Description |
---|---|---|
Tokenizer | StringToWordVector.getTokenizer() | Returns the current tokenizer algorithm. |

Methods in weka.filters.unsupervised.attribute with parameters of type Tokenizer

Modifier and Type | Method | Description |
---|---|---|
void | StringToWordVector.setTokenizer(Tokenizer value) | Sets the tokenizer algorithm to use. |

Copyright © 2016 University of Waikato, Hamilton, NZ. All Rights Reserved.