Class TokenizerModel

    • Constructor Detail

      • TokenizerModel

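        public TokenizerModel(MaxentModel tokenizerModel,
                              java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries,
                              TokenizerFactory tokenizerFactory)
        Initializes the current instance from a trained model, manifest entries, and a tokenizer factory.
        Parameters:
        tokenizerModel - the statistical model of the tokenizer
        manifestInfoEntries - the additional metadata which should be written into the manifest
        tokenizerFactory - the tokenizer factory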
      • TokenizerModel

        public TokenizerModel(java.lang.String language,
                              MaxentModel tokenizerMaxentModel,
                              Dictionary abbreviations,
                              boolean useAlphaNumericOptimization,
                              java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
        Initializes the current instance.
        Parameters:
        language - the language the tokenizer should use
        tokenizerMaxentModel - the statistical model of the tokenizer
        abbreviations - the dictionary containing the abbreviations
        useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is disabled
        manifestInfoEntries - the additional metadata which should be written into the manifest
      • TokenizerModel

        public TokenizerModel(java.lang.String language,
                              AbstractModel tokenizerMaxentModel,
                              boolean useAlphaNumericOptimization,
                              java.util.Map<java.lang.String,java.lang.String> manifestInfoEntries)
        Initializes the current instance.
        Parameters:
        language - the language the tokenizer should use
        tokenizerMaxentModel - the statistical model of the tokenizer
        useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is disabled
        manifestInfoEntries - the additional metadata which should be written into the manifest
      • TokenizerModel

        public TokenizerModel(java.lang.String language,
                              AbstractModel tokenizerMaxentModel,
                              boolean useAlphaNumericOptimization)
        Initializes the current instance.
        Parameters:
        language - the language the tokenizer should use
        tokenizerMaxentModel - the statistical model of the tokenizer
        useAlphaNumericOptimization - if true, alphanumeric optimization is enabled; otherwise it is disabled
      • TokenizerModel

        public TokenizerModel(java.io.InputStream in)
                       throws java.io.IOException,
                              InvalidFormatException
        Initializes the current instance by loading the model from an input stream.
        Parameters:
        in - the input stream to load the model from
        Throws:
        java.io.IOException - if reading from the stream fails in any way
        InvalidFormatException - if the stream does not have the expected format
      • TokenizerModel

        public TokenizerModel(java.io.File modelFile)
                       throws java.io.IOException,
                              InvalidFormatException
        Initializes the current instance by loading the model from a file.
        Parameters:
        modelFile - the file containing the tokenizer model
        Throws:
        java.io.IOException - if reading from the file fails in any way
        InvalidFormatException - if the file content does not have the expected format
      • TokenizerModel

        public TokenizerModel(java.net.URL modelURL)
                       throws java.io.IOException,
                              InvalidFormatException
        Initializes the current instance by loading the model from a URL.
        Parameters:
        modelURL - the URL pointing to the tokenizer model
        Throws:
        java.io.IOException - if reading from the URL fails in any way
        InvalidFormatException - if the content at the URL does not have the expected format
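
The stream-based constructor leaves opening the stream to the caller, and this page does not say whether the constructor closes it. The following is a minimal sketch of one way to handle that with try-with-resources, assuming the caller owns the stream. `ModelLoading`, `StreamLoader`, and `loadFrom` are hypothetical names introduced only for illustration; `StreamLoader` stands in for a constructor such as `TokenizerModel(InputStream)` so the sketch compiles without OpenNLP on the classpath.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch only: the names in this class are hypothetical and not part of OpenNLP. */
final class ModelLoading {

    /** Hypothetical stand-in for a constructor that reads a model from a stream. */
    @FunctionalInterface
    interface StreamLoader<T> {
        T load(InputStream in) throws IOException;
    }

    /** Opens the file, passes the stream to the loader, and closes the stream even on failure. */
    static <T> T loadFrom(Path modelPath, StreamLoader<T> loader) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(modelPath))) {
            return loader.load(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data: a real caller would point at a trained tokenizer model file.
        Path tmp = Files.createTempFile("tokenizer-model", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3});
            // Stand-in loader that only reads the raw bytes.
            byte[] data = loadFrom(tmp, InputStream::readAllBytes);
            System.out.println("read " + data.length + " bytes");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With OpenNLP available, the same helper could presumably be called as `loadFrom(path, TokenizerModel::new)`, since the constructor above accepts an `InputStream` and declares `IOException`; this assumes `InvalidFormatException` is a subtype of `IOException`, which this page does not state.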
    • Method Detail

      • getAbbreviations

        public Dictionary getAbbreviations()
      • useAlphaNumericOptimization

        public boolean useAlphaNumericOptimization()
      • main

        public static void main(java.lang.String[] args)
                         throws java.io.IOException
        Throws:
        java.io.IOException