All Classes
Class: Description
AbstractDetector
CharacterClasses: Determines the class of a given character.
CharacterUtils: CharacterUtils provides a unified interface to Character-related operations to implement backwards compatible character operations.
CharacterUtils.CharacterBuffer: A simple IO buffer to use with CharacterUtils.fill(CharacterBuffer, Reader).
CharArrayMap<V>: A simple class that stores key Strings as char[]'s in a hash table.
CharArraySet: A simple class that stores Strings as char[]'s in a hash table.
Detection
DetectionException: Exception that is thrown when detection fails.
Detector: Abstract superclass of all Detectors used for language and encoding detection.
Embedder: An embedder converts a text string to a tensor.
Embedder.Context
Embedder.FailingEmbedder
GramSplitter: A class which splits consecutive word character sequences into overlapping character n-grams.
GramSplitter.Gram: An immutable start index and length pair.
GramSplitter.GramSplitterIterator
Hint: A hint that can be given to a Detector.
KStemmer: A stemmer implementing the Kstem algorithm by Bob Krovetz.
Language
Linguistics: Factory of linguistic processors.
Linguistics.Component
LinguisticsCase: This class provides a case normalization operation to be used e.g.
LocaleFactory
Normalizer: This interface provides NFKC normalization of Strings through the underlying linguistics library.
OpenNlpLinguistics: Returns a linguistics implementation based on OpenNlp, and (optionally, default on) Optimaize for language detection.
OpennlpLinguisticsConfig: This class represents the root node of opennlp-linguistics Copyright Yahoo.
OpennlpLinguisticsConfig.Builder
OpennlpLinguisticsConfig.Detector: This class represents opennlp-linguistics.detector
OpennlpLinguisticsConfig.Detector.Builder
OpennlpLinguisticsConfig.Producer
OpenNlpTokenizer: Tokenizer using OpenNlp.
OpenStringBuilder: A StringBuilder that allows one to access the array.
OptimaizeDetector: Detects the language of some sample text using SimpleDetector for CJK and Optimaize otherwise.
ProcessingException: Exception class indicating that a fatal error occurred during linguistic processing.
Segmenter: Interface providing segmentation, i.e.
SegmenterImpl
SimpleDetector: Includes functionality for determining the langCode from a sample or from the encoding.
SimpleLinguistics: Factory of simple linguistic processor implementations.
SimpleNormalizer
SimpleToken
SimpleTokenizer: A tokenizer which splits on whitespace, normalizes and transforms using the given implementations and stems using the kstem algorithm.
SimpleTokenType
SimpleTransformer: Converts all accented characters into their de-accented counterparts followed by their combining diacritics, then strips off the diacritics using a regex.
SpecialTokenRegistry: Immutable named lists of "special tokens" - strings which should override the normal tokenizer semantics and be tokenized into a single token.
SpecialTokens: An immutable list of special tokens - strings which should override the normal tokenizer semantics and be tokenized into a single token.
SpecialTokens.Token: An immutable special token.
StemList: A list of strings which does not allow for duplicate elements.
Stemmer: Interface providing stemming of single words.
StemmerImpl
StemMode: An enum of the stemming modes which can be requested.
Token: A single token produced by the tokenizer.
Tokenizer: Language-sensitive tokenization of a text string.
TokenScript: List of token scripts (e.g.
TokenType: An enumeration of token types.
Transformer: Interface for providers of text transformations such as accent removal.
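
The SimpleTransformer entry above describes a two-step approach to accent removal: decompose each accented character into a base character plus combining diacritics, then strip the diacritics with a regex. The following is a minimal sketch of that general technique using only the JDK (java.text.Normalizer and java.util.regex). The class name AccentStripSketch and the method stripAccents are hypothetical names chosen for illustration; this is not the library's implementation and does not use its API.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of the library indexed above.
final class AccentStripSketch {

    // Matches characters in the Unicode "Combining Diacritical Marks" block (U+0300 to U+036F).
    private static final Pattern DIACRITICS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    // Decomposes accented characters (NFD), then removes the combining marks.
    static String stripAccents(String input) {
        // NFD turns U+00E9 (e with acute) into 'e' followed by U+0301 (combining acute accent).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return DIACRITICS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(stripAccents("Cr\u00e8me br\u00fbl\u00e9e"));
    }
}
```

Characters that have no canonical decomposition (for example U+00F8, the Danish o with stroke) pass through this sketch unchanged, so a production transformer may need additional mapping rules.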