Package org.apache.lucene.analysis.core
Class StopAnalyzer
- java.lang.Object
  - org.apache.lucene.analysis.Analyzer
    - org.apache.lucene.analysis.util.StopwordAnalyzerBase
      - org.apache.lucene.analysis.core.StopAnalyzer
All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public final class StopAnalyzer extends StopwordAnalyzerBase
Filters LetterTokenizer with LowerCaseFilter and StopFilter.
You must specify the required Version compatibility when creating StopAnalyzer:
- As of 3.1, StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords
- As of 2.9, position increments are preserved
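The chain described above (split on non-letters, lowercase, drop stop words, keep position increments) can be approximated in plain JDK code. The following is only a sketch of that behavior, not Lucene's implementation: the class `StopAnalyzerSketch`, its `Token` type, and the `analyze` method are hypothetical names introduced for illustration, and `String.toLowerCase(Locale.ROOT)` stands in for LowerCaseFilter, which may differ on some characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the LetterTokenizer -> LowerCaseFilter -> StopFilter chain.
final class StopAnalyzerSketch {

    /** A term plus its position increment relative to the previously emitted term. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    static List<Token> analyze(String text, Set<String> stopWords) {
        List<Token> out = new ArrayList<>();
        int skipped = 0; // stop words removed since the last emitted token
        int i = 0;
        int n = text.length();
        while (i < n) {
            int cp = text.codePointAt(i);
            if (!Character.isLetter(cp)) {
                // Non-letters only separate tokens; they are never emitted.
                i += Character.charCount(cp);
                continue;
            }
            int start = i;
            // Consume a maximal run of letters, walking by code point so that
            // supplementary characters are not split.
            while (i < n) {
                cp = text.codePointAt(i);
                if (!Character.isLetter(cp)) {
                    break;
                }
                i += Character.charCount(cp);
            }
            String term = text.substring(start, i).toLowerCase(Locale.ROOT);
            if (stopWords.contains(term)) {
                skipped++; // leave a gap instead of closing it
            } else {
                out.add(new Token(term, skipped + 1));
                skipped = 0;
            }
        }
        return out;
    }
}
```

With the stop words "the" and "and", the input `The Quick-brown fox, and 2 dogs` yields `quick`, `brown`, `fox`, `dogs`. The terms `quick` and `dogs` carry a position increment of 2 because a stop word was removed immediately before each; this is the gap that "position increments are preserved" refers to.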
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.GlobalReuseStrategy, Analyzer.PerFieldReuseStrategy, Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Field Summary
Fields
Modifier and Type | Field | Description
static CharArraySet | ENGLISH_STOP_WORDS_SET | An unmodifiable set containing some common English words that are not usually useful for searching.
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Constructor Summary
Constructors
Constructor | Description
StopAnalyzer(Version matchVersion) | Builds an analyzer which removes words in ENGLISH_STOP_WORDS_SET.
StopAnalyzer(Version matchVersion, java.io.File stopwordsFile) | Builds an analyzer with the stop words from the given file.
StopAnalyzer(Version matchVersion, java.io.Reader stopwords) | Builds an analyzer with the stop words from the given reader.
StopAnalyzer(Version matchVersion, CharArraySet stopWords) | Builds an analyzer with the stop words from the given set.
Method Summary
Methods inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase
getStopwordSet
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, tokenStream, tokenStream
Field Detail
ENGLISH_STOP_WORDS_SET
public static final CharArraySet ENGLISH_STOP_WORDS_SET
An unmodifiable set containing some common English words that are not usually useful for searching.
Constructor Detail
StopAnalyzer
public StopAnalyzer(Version matchVersion)
Builds an analyzer which removes words in ENGLISH_STOP_WORDS_SET.
Parameters:
matchVersion - Version compatibility to match; see the class description
StopAnalyzer
public StopAnalyzer(Version matchVersion, CharArraySet stopWords)
Builds an analyzer with the stop words from the given set.
Parameters:
matchVersion - Version compatibility to match; see the class description
stopWords - Set of stop words
StopAnalyzer
public StopAnalyzer(Version matchVersion, java.io.File stopwordsFile) throws java.io.IOException
Builds an analyzer with the stop words from the given file.
Parameters:
matchVersion - Version compatibility to match; see the class description
stopwordsFile - File to load stop words from
Throws:
java.io.IOException
See Also:
WordlistLoader.getWordSet(Reader, Version)
StopAnalyzer
public StopAnalyzer(Version matchVersion, java.io.Reader stopwords) throws java.io.IOException
Builds an analyzer with the stop words from the given reader.
Parameters:
matchVersion - Version compatibility to match; see the class description
stopwords - Reader to load stop words from
Throws:
java.io.IOException
See Also:
WordlistLoader.getWordSet(Reader, Version)
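The File and Reader constructors both refer to WordlistLoader.getWordSet(Reader, Version) for loading. As far as that method's documentation describes, it expects one word per line and trims surrounding whitespace. The plain-JDK sketch below illustrates that input format under this assumption; `StopwordListSketch` and `readStopwords` are hypothetical names, and skipping blank lines and closing the reader are choices made for this sketch, not confirmed Lucene behavior.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper showing the assumed one-word-per-line stop word format.
final class StopwordListSketch {

    static Set<String> readStopwords(Reader reader) throws IOException {
        Set<String> words = new LinkedHashSet<>(); // keeps file order, drops duplicates
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String word = line.trim(); // tolerate leading/trailing whitespace
                if (!word.isEmpty()) {     // sketch choice: ignore blank lines
                    words.add(word);
                }
            }
        }
        return words;
    }
}
```

For example, a reader over the text `"the\n  and \n\nof\n"` produces the three entries `the`, `and`, `of`. An I/O failure while reading surfaces as java.io.IOException, matching the exception the two constructors declare.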