Package org.apache.lucene.analysis.core
Class StopFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.util.AbstractAnalysisFactory
    - org.apache.lucene.analysis.util.TokenFilterFactory
      - org.apache.lucene.analysis.core.StopFilterFactory
- All Implemented Interfaces: ResourceLoaderAware
public class StopFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Factory for StopFilter.

<fieldType name="text_stop" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" format="wordset" />
  </analyzer>
</fieldType>
All attributes are optional:
- ignoreCase defaults to false
- words should be the name of a stopwords file to parse; if not specified, the factory will use StopAnalyzer.ENGLISH_STOP_WORDS_SET
- format defines how the words file will be parsed, and defaults to wordset. If words is not specified, then format must not be specified.
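The attribute rules above can be modeled in a few lines of plain Java. This is a minimal sketch, not Lucene source code: the class name StopFilterArgsSketch is hypothetical, and the use of IllegalArgumentException for invalid combinations is an assumption made for illustration.

```java
import java.util.Collections;
import java.util.Map;

/** Plain-Java model of the attribute rules above; this is not Lucene source code. */
final class StopFilterArgsSketch {
    final boolean ignoreCase; // defaults to false
    final String words;       // null: fall back to the built-in English stop set
    final String format;      // null when no words file is configured

    StopFilterArgsSketch(Map<String, String> args) {
        // Boolean.parseBoolean(null) is false, which gives the documented default.
        ignoreCase = Boolean.parseBoolean(args.get("ignoreCase"));
        words = args.get("words");
        String requested = args.get("format");
        if (words == null) {
            // format describes how the words file is parsed, so it is rejected without one.
            if (requested != null) {
                throw new IllegalArgumentException("'format' must not be specified without 'words'");
            }
            format = null;
        } else if (requested == null) {
            format = "wordset"; // documented default
        } else if (requested.equals("wordset") || requested.equals("snowball")) {
            format = requested;
        } else {
            // Exception type is an assumption of this sketch.
            throw new IllegalArgumentException("Unknown 'format': " + requested);
        }
    }

    public static void main(String[] cli) {
        StopFilterArgsSketch s =
            new StopFilterArgsSketch(Collections.singletonMap("words", "stopwords.txt"));
        System.out.println(s.ignoreCase + " " + s.words + " " + s.format);
    }
}
```

With only words given, the sketch resolves format to wordset and leaves ignoreCase at false, matching the defaults listed above.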
The valid values for the format option are:
- wordset: This is the default format, which supports one word per line (including any intra-word whitespace) and allows whole-line comments beginning with the "#" character. Blank lines are ignored. See WordlistLoader.getLines for details.
- snowball: This format allows multiple words on each line, and trailing comments may be specified using the vertical line ("|"). Blank lines are ignored. See WordlistLoader.getSnowballWordSet for details.
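To make the difference between the two formats concrete, the following plain-Java sketch parses each one as described above. It is an illustration only and does not call WordlistLoader; the class and method names are hypothetical, and trimming surrounding whitespace from each entry is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative parsers for the two stopword file formats; not Lucene code. */
final class StopWordFormatsSketch {

    /** wordset: one entry per line, "#" starts a whole-line comment, blank lines skipped. */
    static Set<String> parseWordset(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            if (line.startsWith("#")) {
                continue; // whole-line comment
            }
            String entry = line.trim(); // assumption: surrounding whitespace is dropped
            if (!entry.isEmpty()) {
                words.add(entry); // intra-word whitespace is kept, e.g. "new york"
            }
        }
        return words;
    }

    /** snowball: several words per line, "|" starts a trailing comment, blank lines skipped. */
    static Set<String> parseSnowball(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            String content = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (content.isEmpty()) {
                continue; // blank line or comment-only line
            }
            words.addAll(Arrays.asList(content.split("\\s+")));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(parseWordset(Arrays.asList("# comment", "", "new york", "the")));
        System.out.println(parseSnowball(Arrays.asList("a an | articles", " | note", "the")));
    }
}
```

Note that the same line, "new york", yields one entry under wordset and two entries under snowball, which is the main practical reason to choose one format over the other.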
Field Summary
Fields
| Modifier and Type | Field | Description |
|---|---|---|
| static String | FORMAT_SNOWBALL | |
| static String | FORMAT_WORDSET | |
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM
Constructor Summary
Constructors
| Constructor | Description |
|---|---|
| StopFilterFactory(Map<String,String> args) | Creates a new StopFilterFactory |
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| TokenStream | create(TokenStream input) | Transform the specified input TokenStream |
| CharArraySet | getStopWords() | |
| void | inform(ResourceLoader loader) | Initializes this component with the provided ResourceLoader (used for loading classes, files, etc). |
| boolean | isEnablePositionIncrements() | |
| boolean | isIgnoreCase() | |
Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, forName, lookupClass, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getChar, getClassArg, getLuceneMatchVersion, getOriginalArgs, getSet, isExplicitLuceneMatchVersion, require, require, require, requireChar, setExplicitLuceneMatchVersion
Field Detail
FORMAT_WORDSET
public static final String FORMAT_WORDSET
See Also: Constant Field Values

FORMAT_SNOWBALL
public static final String FORMAT_SNOWBALL
See Also: Constant Field Values
Method Detail
inform
public void inform(ResourceLoader loader) throws IOException
Description copied from interface: ResourceLoaderAware
Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
Specified by: inform in interface ResourceLoaderAware
Throws: IOException

isEnablePositionIncrements
public boolean isEnablePositionIncrements()

isIgnoreCase
public boolean isIgnoreCase()

getStopWords
public CharArraySet getStopWords()

create
public TokenStream create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
Specified by: create in class TokenFilterFactory
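
As a rough mental model of what the stream returned by create does with the configured stop words and the ignoreCase setting, the following plain-Java sketch filters a list of tokens. It is not the Lucene implementation: it works on strings rather than a TokenStream, ignores position increments, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** String-based model of stop word removal; not the Lucene StopFilter. */
final class StopFilteringSketch {

    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords,
                                        boolean ignoreCase) {
        // Normalize the stop set once so lookups are cheap.
        Set<String> lookup = new HashSet<>();
        for (String word : stopWords) {
            lookup.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String key = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
            if (!lookup.contains(key)) {
                kept.add(token); // surviving tokens keep their original case
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("the", "a"));
        List<String> tokens = Arrays.asList("The", "quick", "fox", "a");
        System.out.println(removeStopWords(tokens, stops, true));
        System.out.println(removeStopWords(tokens, stops, false));
    }
}
```

With ignoreCase enabled, "The" is dropped along with "a"; with it disabled, "The" survives because only the exact form "the" is in the stop set.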