Class OpenNlpLinguistics

  • All Implemented Interfaces:
    Linguistics

    public class OpenNlpLinguistics
    extends SimpleLinguistics
    A linguistics implementation based on OpenNLP, with Optimaize optionally used for language detection (enabled by default).
    • Constructor Detail

      • OpenNlpLinguistics

        public OpenNlpLinguistics()
      • OpenNlpLinguistics

        public OpenNlpLinguistics(boolean enableOptimaize)
    • Method Detail

      • getTokenizer

        public Tokenizer getTokenizer()
        Description copied from interface: Linguistics
        Returns a thread-unsafe tokenizer. This is used at indexing time to produce an optionally stemmed and transformed (accent-normalized) stream of indexable tokens.
        Specified by:
        getTokenizer in interface Linguistics
        Overrides:
        getTokenizer in class SimpleLinguistics
      • getDetector

        public Detector getDetector()
        Description copied from interface: Linguistics
        Returns a thread-unsafe detector. This is used to determine the language of a query or document field when it is not specified explicitly. The detected language is then a parameter to other linguistic operations.
        Specified by:
        getDetector in interface Linguistics
        Overrides:
        getDetector in class SimpleLinguistics
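
Both getTokenizer and getDetector are documented as returning thread-unsafe objects, so callers that work from several threads need some way to avoid sharing one instance. The following is a minimal sketch of one common approach, keeping one instance per thread with a ThreadLocal. It deliberately does not use the real library types: StatefulTokenizer and PerThread are hypothetical names introduced only for this illustration, and StatefulTokenizer merely imitates a component with mutable internal state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a thread-unsafe component such as a tokenizer.
// This is NOT the library's Tokenizer interface; it only mimics mutable state.
final class StatefulTokenizer {
    // Scratch buffer reused across calls, which is what makes sharing unsafe.
    private final StringBuilder buffer = new StringBuilder();

    // Splits text into lowercased runs of letters and digits.
    List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        buffer.setLength(0);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                buffer.append(Character.toLowerCase(c));
            } else if (buffer.length() > 0) {
                tokens.add(buffer.toString());
                buffer.setLength(0);
            }
        }
        if (buffer.length() > 0) {
            tokens.add(buffer.toString());
        }
        return tokens;
    }
}

// Hypothetical helper: lazily creates one instance per thread from a factory.
final class PerThread<T> {
    private final ThreadLocal<T> local;

    PerThread(Supplier<T> factory) {
        this.local = ThreadLocal.withInitial(factory);
    }

    // Returns the calling thread's own instance, creating it on first use.
    T get() {
        return local.get();
    }
}
```

With the class documented here, the factory passed to PerThread would presumably be a method reference such as linguistics::getTokenizer or linguistics::getDetector, assuming each call hands back an instance that is safe to confine to a single thread. Whether per-thread caching or simply calling the getter on each use is preferable depends on how expensive instance creation is, which this page does not state.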