Package org.apache.lucene.analysis.util
Utility functions for text analysis.
Interface Summary
Interface: Description
MultiTermAwareComponent: Add to any analysis factory component to allow returning an analysis component factory for use with partial terms in prefix queries, wildcard queries, range query endpoints, regex queries, etc.
ResourceLoader: Abstraction for loading resources (streams, files, and classes).
ResourceLoaderAware: Interface for a component that needs to be initialized by an implementation of ResourceLoader.
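
The ResourceLoader abstraction described above (one object that hands out both streams and classes) can be illustrated with a small, self-contained sketch. The names SketchResourceLoader and SketchClasspathLoader are hypothetical and are not Lucene's actual types or signatures; the sketch only assumes the two JDK calls ClassLoader.getResourceAsStream(String) and Class.forName(String,boolean,ClassLoader).

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the "resource loader" idea; NOT Lucene's interface.
interface SketchResourceLoader {
    // Opens a named resource as a stream, failing if it cannot be found.
    InputStream openResource(String name) throws IOException;

    // Looks up a class by name and checks that it is a subtype of expectedType.
    <T> Class<? extends T> findClass(String className, Class<T> expectedType);
}

// Hypothetical classpath-backed implementation built only on JDK calls.
final class SketchClasspathLoader implements SketchResourceLoader {
    private final ClassLoader loader;

    SketchClasspathLoader(ClassLoader loader) {
        this.loader = loader;
    }

    @Override
    public InputStream openResource(String name) throws IOException {
        // getResourceAsStream returns null rather than throwing when nothing matches.
        InputStream in = loader.getResourceAsStream(name);
        if (in == null) {
            throw new IOException("Resource not found: " + name);
        }
        return in;
    }

    @Override
    public <T> Class<? extends T> findClass(String className, Class<T> expectedType) {
        try {
            // 'true' asks the JVM to initialize the class when it is loaded.
            return Class.forName(className, true, loader).asSubclass(expectedType);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Cannot load class: " + className, e);
        }
    }
}
```

A component that needs external resources would then be handed such a loader after construction instead of opening files itself, which is presumably the role ResourceLoaderAware plays for analysis components.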
Class Summary
Class: Description
AbstractAnalysisFactory: Abstract parent class for analysis factories TokenizerFactory, TokenFilterFactory and CharFilterFactory.
CharacterUtils: CharacterUtils provides a unified interface to Character-related operations to implement backwards compatible character operations based on a Version instance.
CharacterUtils.CharacterBuffer: A simple IO buffer to use with CharacterUtils.fill(CharacterBuffer, Reader).
CharArrayIterator: A CharacterIterator used internally for use with BreakIterator.
CharArrayMap<V>: A simple class that stores key Strings as char[]'s in a hash table.
CharArraySet: A simple class that stores Strings as char[]'s in a hash table.
CharFilterFactory: Abstract parent class for analysis factories that create CharFilter instances.
CharTokenizer: An abstract base class for simple, character-oriented tokenizers.
ClasspathResourceLoader: Simple ResourceLoader that uses ClassLoader.getResourceAsStream(String) and Class.forName(String,boolean,ClassLoader) to open resources and classes, respectively.
ElisionFilter: Removes elisions from a TokenStream.
ElisionFilterFactory: Factory for ElisionFilter.
FilesystemResourceLoader: Simple ResourceLoader that opens resource files from the local file system, optionally resolving against a base directory.
FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens.
OpenStringBuilder: A StringBuilder that allows one to access the array.
RollingCharBuffer: Acts like a forever growing char[] as you read characters into it from the provided reader, but internally it uses a circular buffer to only hold the characters that haven't been freed yet.
StemmerUtil: Some commonly-used stemming functions.
StopwordAnalyzerBase: Base class for Analyzers that need to make use of stopword sets.
TokenFilterFactory: Abstract parent class for analysis factories that create TokenFilter instances.
TokenizerFactory: Abstract parent class for analysis factories that create Tokenizer instances.
WordlistLoader: Loader for text files that represent a list of stopwords.
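
The RollingCharBuffer description above (an apparently ever-growing char[] backed by a circular buffer that keeps only characters not yet freed) may be easier to follow with a minimal sketch. SketchRollingBuffer and its methods get, freeBefore, and capacity are hypothetical names chosen for illustration; this is not Lucene's implementation, only one way the idea could be realized with plain JDK classes.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical illustration of a rolling character buffer; NOT Lucene's class.
final class SketchRollingBuffer {
    private final Reader reader;
    private char[] buffer = new char[4]; // deliberately tiny initial capacity
    private int start = 0;               // absolute position of the oldest retained char
    private int end = 0;                 // absolute position one past the last char read
    private boolean eof = false;

    SketchRollingBuffer(Reader reader) {
        this.reader = reader;
    }

    // Returns the char at absolute position pos, reading ahead as needed,
    // or -1 if the input ends before pos.
    int get(int pos) throws IOException {
        if (pos < start) {
            throw new IllegalArgumentException("position " + pos + " was already freed");
        }
        while (pos >= end) {
            if (eof) {
                return -1;
            }
            int ch = reader.read();
            if (ch == -1) {
                eof = true;
                return -1;
            }
            if (end - start == buffer.length) {
                grow(); // every slot holds a live char, so enlarge the ring
            }
            buffer[end % buffer.length] = (char) ch;
            end++;
        }
        return buffer[pos % buffer.length];
    }

    // Declares that positions before pos will not be requested again,
    // so their slots can be reused by later reads.
    void freeBefore(int pos) {
        if (pos < start || pos > end) {
            throw new IllegalArgumentException("cannot free before position " + pos);
        }
        start = pos;
    }

    // Current size of the backing array, exposed only to make reuse observable.
    int capacity() {
        return buffer.length;
    }

    // Doubles the ring and re-places each live char at its new slot.
    private void grow() {
        char[] bigger = new char[buffer.length * 2];
        for (int p = start; p < end; p++) {
            bigger[p % bigger.length] = buffer[p % buffer.length];
        }
        buffer = bigger;
    }
}
```

In this sketch callers always address characters by absolute position, and the backing array grows only when more characters are live than it can hold; calling freeBefore regularly keeps memory use bounded by the lookahead window rather than by the input length.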