Package opennlp.tools.tokenize
Class TokenizerModel
- java.lang.Object
-
- opennlp.tools.util.model.BaseModel
-
- opennlp.tools.tokenize.TokenizerModel
-
- All Implemented Interfaces:
ArtifactProvider
public final class TokenizerModel extends BaseModel
The TokenizerModel is the model used by a learnable Tokenizer.
- See Also:
TokenizerME
-
-
Field Summary
-
Fields inherited from class opennlp.tools.util.model.BaseModel
TRAINING_CUTOFF_PROPERTY, TRAINING_EVENTHASH_PROPERTY, TRAINING_ITERATIONS_PROPERTY
-
-
Constructor Summary
Constructors
Constructor / Description
TokenizerModel(java.io.File modelFile)
Initializes the current instance.
TokenizerModel(java.io.InputStream in)
Initializes the current instance.
TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
TokenizerModel(java.lang.String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
TokenizerModel(java.net.URL modelURL)
Initializes the current instance.
TokenizerModel(MaxentModel tokenizerModel, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries, TokenizerFactory tokenizerFactory)
Initializes the current instance.
-
Method Summary
Modifier and Type / Method
Dictionary
getAbbreviations()
TokenizerFactory
getFactory()
MaxentModel
getMaxentModel()
static void
main(java.lang.String[] args)
boolean
useAlphaNumericOptimization()
-
Methods inherited from class opennlp.tools.util.model.BaseModel
getArtifact, getLanguage, getManifestProperty, getVersion, isLoadedFromSerialized, serialize
-
Constructor Detail
-
TokenizerModel
public TokenizerModel(MaxentModel tokenizerModel, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries, TokenizerFactory tokenizerFactory)
Initializes the current instance.
- Parameters:
tokenizerModel - the model
manifestInfoEntries - the manifest
tokenizerFactory - the factory
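The deprecated constructors on this page all point to this one. The following is a minimal sketch of how the replacement call might look. It assumes that TokenizerFactory offers a four-argument constructor (String, Dictionary, boolean, Pattern), that passing a null Pattern makes the factory fall back to its default, and that MaxentModel lives in opennlp.tools.ml.model. The class name FactoryMigration and the helper names factoryFor and build are hypothetical.

```java
import java.util.Map;

import opennlp.tools.dictionary.Dictionary;
import opennlp.tools.ml.model.MaxentModel;
import opennlp.tools.tokenize.TokenizerFactory;
import opennlp.tools.tokenize.TokenizerModel;

public class FactoryMigration {

    /**
     * Bundles the settings that the deprecated constructors took as
     * separate arguments into a TokenizerFactory.
     */
    static TokenizerFactory factoryFor(String language, Dictionary abbreviations,
                                       boolean useAlphaNumericOptimization) {
        // Assumption: a null Pattern leaves the alphanumeric pattern at its default.
        return new TokenizerFactory(language, abbreviations, useAlphaNumericOptimization, null);
    }

    /**
     * Sketch of a replacement for the deprecated
     * TokenizerModel(language, model, abbreviations, useAlphaNumericOptimization, manifest).
     */
    static TokenizerModel build(String language, MaxentModel maxentModel,
                                Dictionary abbreviations, boolean useAlphaNumericOptimization,
                                Map<String, String> manifestInfoEntries) {
        TokenizerFactory factory = factoryFor(language, abbreviations, useAlphaNumericOptimization);
        return new TokenizerModel(maxentModel, manifestInfoEntries, factory);
    }
}
```

Keeping the language, abbreviation dictionary and optimization flag in the factory means a caller passes one object instead of three loose arguments, which appears to be the intent of the deprecation notes.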
-
TokenizerModel
public TokenizerModel(java.lang.String language, MaxentModel tokenizerMaxentModel, Dictionary abbreviations, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Initializes the current instance.
- Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
abbreviations - the dictionary containing the abbreviations
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
manifestInfoEntries - the additional metadata that should be written into the manifest
-
TokenizerModel
public TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization, java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Initializes the current instance.
- Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
manifestInfoEntries - the additional metadata that should be written into the manifest
-
TokenizerModel
public TokenizerModel(java.lang.String language, AbstractModel tokenizerMaxentModel, boolean useAlphaNumericOptimization)
Deprecated. Use TokenizerModel(MaxentModel, Map, TokenizerFactory) instead and pass in a TokenizerFactory.
Initializes the current instance.
- Parameters:
language - the language the tokenizer should use
tokenizerMaxentModel - the statistical model of the tokenizer
useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is not
-
TokenizerModel
public TokenizerModel(java.io.InputStream in) throws java.io.IOException, InvalidFormatException
Initializes the current instance.
- Parameters:
in - the input stream to load the model from
- Throws:
java.io.IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format
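The following is a minimal sketch of loading a model through this constructor and handing it to TokenizerME, the class listed under See Also. The class name LoadTokenizerModel, the helper load, the model path taken from args[0] and the sample sentence are assumptions for illustration only. The calls TokenizerME(TokenizerModel) and tokenize(String) are taken from the OpenNLP tokenizer API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import opennlp.tools.tokenize.TokenizerME;
import opennlp.tools.tokenize.TokenizerModel;

public class LoadTokenizerModel {

    /**
     * Opens the file at modelPath and reads a TokenizerModel from it.
     * InvalidFormatException is a subclass of IOException, so a single
     * throws clause covers both documented exceptions.
     */
    static TokenizerModel load(Path modelPath) throws IOException {
        // The caller owns the stream, so close it here once the model is built.
        try (InputStream in = Files.newInputStream(modelPath)) {
            return new TokenizerModel(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a serialized tokenizer model (hypothetical input).
        TokenizerModel model = load(Paths.get(args[0]));
        TokenizerME tokenizer = new TokenizerME(model);
        for (String token : tokenizer.tokenize("Hello, world!")) {
            System.out.println(token);
        }
    }
}
```

The File and URL constructors cover the same loading step when the model is already addressable that way; the stream variant is the one to reach for when the model comes from a classpath resource or another custom source.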
-
TokenizerModel
public TokenizerModel(java.io.File modelFile) throws java.io.IOException, InvalidFormatException
Initializes the current instance.
- Parameters:
modelFile - the file containing the tokenizer model
- Throws:
java.io.IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format
-
TokenizerModel
public TokenizerModel(java.net.URL modelURL) throws java.io.IOException, InvalidFormatException
Initializes the current instance.
- Parameters:
modelURL - the URL pointing to the tokenizer model
- Throws:
java.io.IOException - if reading from the stream fails in any way
InvalidFormatException - if the stream doesn't have the expected format
-
-
Method Detail
-
getFactory
public TokenizerFactory getFactory()
-
getMaxentModel
public MaxentModel getMaxentModel()
-
getAbbreviations
public Dictionary getAbbreviations()
-
useAlphaNumericOptimization
public boolean useAlphaNumericOptimization()
-
main
public static void main(java.lang.String[] args) throws java.io.IOException
- Throws:
java.io.IOException