Package org.apache.lucene.analysis.util
Utility functions for text analysis.
Class: Description
AbstractAnalysisFactory: Abstract parent class for analysis factories TokenizerFactory, TokenFilterFactory and CharFilterFactory.
CharacterUtils: CharacterUtils provides a unified interface to Character-related operations to implement backwards compatible character operations based on a Version instance.
CharacterUtils.CharacterBuffer: A simple IO buffer to use with CharacterUtils.fill(CharacterBuffer, Reader).
CharArrayIterator: A CharacterIterator used internally for use with BreakIterator.
CharArrayMap<V>: A simple class that stores key Strings as char[]'s in a hash table.
CharArraySet: A simple class that stores Strings as char[]'s in a hash table.
CharFilterFactory: Abstract parent class for analysis factories that create CharFilter instances.
CharTokenizer: An abstract base class for simple, character-oriented tokenizers.
ClasspathResourceLoader: Simple ResourceLoader that uses ClassLoader.getResourceAsStream(String) and Class.forName(String,boolean,ClassLoader) to open resources and classes, respectively.
ElisionFilter: Removes elisions from a TokenStream.
ElisionFilterFactory: Factory for ElisionFilter.
FilesystemResourceLoader: Simple ResourceLoader that opens resource files from the local file system, optionally resolving against a base directory.
FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens.
MultiTermAwareComponent: Add to any analysis factory component to allow returning an analysis component factory for use with partial terms in prefix queries, wildcard queries, range query endpoints, regex queries, etc.
OpenStringBuilder: A StringBuilder that allows one to access the array.
ResourceLoader: Abstraction for loading resources (streams, files, and classes).
ResourceLoaderAware: Interface for a component that needs to be initialized by an implementation of ResourceLoader.
RollingCharBuffer: Acts like a forever growing char[] as you read characters into it from the provided reader, but internally it uses a circular buffer to only hold the characters that haven't been freed yet.
StemmerUtil: Some commonly-used stemming functions.
StopwordAnalyzerBase: Base class for Analyzers that need to make use of stopword sets.
TokenFilterFactory: Abstract parent class for analysis factories that create TokenFilter instances.
TokenizerFactory: Abstract parent class for analysis factories that create Tokenizer instances.
WordlistLoader: Loader for text files that represent a list of stopwords.
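
The rolling character buffer described above (a char[] that appears to grow forever while a circular buffer holds only the characters not yet freed) can be illustrated with a small standalone sketch. This is not the Lucene implementation: the class name RollingBufferSketch and its methods get, freeBefore and capacity are hypothetical, chosen only to show absolute positions being mapped onto a fixed ring of slots. It assumes inputs short enough that an int position does not overflow.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical sketch of a rolling char buffer; not Lucene's own class. */
final class RollingBufferSketch {
    private final Reader reader;
    private char[] ring = new char[8]; // circular storage for retained chars
    private int start = 0;             // absolute position of the oldest retained char
    private int end = 0;               // absolute position one past the last char read
    private boolean eof = false;

    RollingBufferSketch(Reader reader) {
        this.reader = reader;
    }

    /** Returns the char at absolute position pos, or -1 if pos is past the end of input. */
    int get(int pos) throws IOException {
        if (pos < start) {
            throw new IllegalArgumentException("position " + pos + " was already freed");
        }
        // Pull characters from the reader until pos is covered.
        while (pos >= end) {
            if (eof) {
                return -1;
            }
            int c = reader.read();
            if (c == -1) {
                eof = true;
                return -1;
            }
            if (end - start == ring.length) {
                grow(); // every slot holds a live char, so enlarge the ring
            }
            ring[end % ring.length] = (char) c;
            end++;
        }
        return ring[pos % ring.length];
    }

    /** Declares that positions before pos will not be requested again. */
    void freeBefore(int pos) {
        if (pos < start || pos > end) {
            throw new IllegalArgumentException("cannot free before position " + pos);
        }
        start = pos; // the freed slots are reused by later reads
    }

    /** Current number of slots in the ring. */
    int capacity() {
        return ring.length;
    }

    /** Doubles the ring, re-placing each live char at its slot in the new ring. */
    private void grow() {
        char[] bigger = new char[ring.length * 2];
        for (int p = start; p < end; p++) {
            bigger[p % bigger.length] = ring[p % ring.length];
        }
        ring = bigger;
    }

    public static void main(String[] args) throws IOException {
        RollingBufferSketch buf = new RollingBufferSketch(new StringReader("rolling buffer"));
        StringBuilder seen = new StringBuilder();
        int c;
        for (int pos = 0; (c = buf.get(pos)) != -1; pos++) {
            seen.append((char) c);
            buf.freeBefore(pos); // keep only the current char alive
        }
        System.out.println(seen + " (capacity " + buf.capacity() + ")");
    }
}
```

Because the caller frees positions as it advances, the ring in this sketch grows only when the window of live characters exceeds its capacity, not with the total length of the input.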