Package org.apache.lucene.analysis.core
Class StopFilter
- java.lang.Object
  - org.apache.lucene.util.AttributeSource
    - org.apache.lucene.analysis.TokenStream
      - org.apache.lucene.analysis.TokenFilter
        - org.apache.lucene.analysis.util.FilteringTokenFilter
          - org.apache.lucene.analysis.core.StopFilter
All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public final class StopFilter extends FilteringTokenFilter
Removes stop words from a token stream.
You must specify the required Version compatibility when creating StopFilter:
- As of 3.1, StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords, and position increments are preserved.
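The effect of removing stop words while preserving position increments can be illustrated with a minimal, standard-library-only sketch. It does not use Lucene: the class `StopFilterSketch`, its `Token` type, and the `filter` method are hypothetical names chosen for this illustration, and the sketch assumes each removed word adds one to the position increment of the next surviving token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a stand-in for the idea behind stop-word filtering,
// not Lucene's StopFilter implementation.
public class StopFilterSketch {

    /** A surviving term plus the number of positions advanced since the previous surviving term. */
    public static final class Token {
        public final String term;
        public final int positionIncrement;

        public Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Drops every term found in stopWords and records the gap each removal leaves. */
    public static List<Token> filter(List<String> terms, Set<String> stopWords) {
        List<Token> out = new ArrayList<>();
        int skipped = 0; // stop words dropped since the last emitted token
        for (String term : terms) {
            if (stopWords.contains(term)) {
                skipped++;
                continue;
            }
            out.add(new Token(term, 1 + skipped));
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a"));
        for (Token t : filter(Arrays.asList("the", "quick", "a", "fox"), stop)) {
            System.out.println(t.term + " +" + t.positionIncrement);
        }
    }
}
```

Keeping the gap in the increment means a phrase query can still tell that "quick" and "fox" were not adjacent in the original text.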
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.AttributeFactory, AttributeSource.State
Constructor Summary
Constructor | Description
StopFilter(Version matchVersion, TokenStream in, CharArraySet stopWords) | Constructs a filter which removes words from the input TokenStream that are named in the Set.
Method Summary
Modifier and Type | Method | Description
static CharArraySet | makeStopSet(Version matchVersion, java.lang.String... stopWords) | Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
static CharArraySet | makeStopSet(Version matchVersion, java.lang.String[] stopWords, boolean ignoreCase) | Creates a stopword set from the given stopword array.
static CharArraySet | makeStopSet(Version matchVersion, java.util.List<?> stopWords) | Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor.
static CharArraySet | makeStopSet(Version matchVersion, java.util.List<?> stopWords, boolean ignoreCase) | Creates a stopword set from the given stopword list.
Methods inherited from class org.apache.lucene.analysis.util.FilteringTokenFilter
end, getEnablePositionIncrements, incrementToken, reset, setEnablePositionIncrements
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
Constructor Detail
StopFilter
public StopFilter(Version matchVersion, TokenStream in, CharArraySet stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
in - Input stream
stopWords - A CharArraySet representing the stopwords.
See Also:
makeStopSet(Version, java.lang.String...)
Method Detail
makeStopSet
public static CharArraySet makeStopSet(Version matchVersion, java.lang.String... stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop set to be built once and cached when an Analyzer is constructed.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
See Also:
makeStopSet(Version, java.lang.String[], boolean), passing false to ignoreCase
makeStopSet
public static CharArraySet makeStopSet(Version matchVersion, java.util.List<?> stopWords)
Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This allows the stop set to be built once and cached when an Analyzer is constructed.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
makeStopSet(Version, java.util.List, boolean), passing false to ignoreCase
makeStopSet
public static CharArraySet makeStopSet(Version matchVersion, java.lang.String[] stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword array.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
A Set containing the words
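Why supplementary characters matter for case-insensitive stopwords can be shown with plain Java. The following is a sketch under stated assumptions, not Lucene code: the class `CodePointLowerCase` and both helper methods are hypothetical names. It contrasts lower-casing one UTF-16 code unit at a time, which leaves surrogate pairs untouched, with lower-casing one full code point at a time.

```java
// Illustration only: shows the difference between char-based and
// code-point-based lower-casing for characters outside the Basic Multilingual Plane.
public class CodePointLowerCase {

    /** Lower-cases each UTF-16 code unit separately; surrogate halves pass through unchanged. */
    public static String lowerByChar(String s) {
        char[] buf = s.toCharArray();
        for (int i = 0; i < buf.length; i++) {
            buf[i] = Character.toLowerCase(buf[i]);
        }
        return new String(buf);
    }

    /** Lower-cases each full code point, so supplementary characters are mapped too. */
    public static String lowerByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10400 DESERET CAPITAL LETTER LONG I lower-cases to U+10428.
        String upper = new String(Character.toChars(0x10400));
        String lower = new String(Character.toChars(0x10428));
        System.out.println(lowerByChar(upper).equals(lower));      // false
        System.out.println(lowerByCodePoint(upper).equals(lower)); // true
    }
}
```

A stop set that lower-cases by code unit would fail to match such a word when ignoreCase is true, which is the kind of mismatch the matchVersion parameter is documented as addressing.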
makeStopSet
public static CharArraySet makeStopSet(Version matchVersion, java.util.List<?> stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword list.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
A Set (CharArraySet) containing the words
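The list-based variants accept Strings, char[] values, or any other object with a usable toString(). A rough standard-library analogue of that contract is sketched below. It is not Lucene's implementation and returns a plain `java.util.Set` rather than a CharArraySet; `StopSetSketch`, `buildStopSet`, and `isStopWord` are hypothetical names, and lower-casing with `Locale.ROOT` is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: mimics the documented contract with java.util collections.
public class StopSetSketch {

    /** Builds a set from Strings, char[] values, or other toString()-able objects. */
    public static Set<String> buildStopSet(List<?> stopWords, boolean ignoreCase) {
        Set<String> set = new HashSet<>();
        for (Object o : stopWords) {
            // char[] needs special handling because its toString() is not the text it holds.
            String word = (o instanceof char[]) ? new String((char[]) o) : String.valueOf(o);
            set.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return set;
    }

    /** Looks a term up using the same case rule that was used to build the set. */
    public static boolean isStopWord(Set<String> stopSet, String term, boolean ignoreCase) {
        return stopSet.contains(ignoreCase ? term.toLowerCase(Locale.ROOT) : term);
    }

    public static void main(String[] args) {
        List<Object> words = Arrays.<Object>asList("The", new char[] {'A', 'n'}, new StringBuilder("OF"));
        Set<String> stopSet = buildStopSet(words, true);
        System.out.println(isStopWord(stopSet, "THE", true));
        System.out.println(isStopWord(stopSet, "an", true));
        System.out.println(isStopWord(stopSet, "fox", true));
    }
}
```

Building the set once and reusing it mirrors the caching advice given for the two-argument makeStopSet variants.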