Class CommonTermsQuery
- All Implemented Interfaces:
Cloneable
A query that executes high-frequency terms in an optional sub-query to prevent
slow queries due to "common" terms like stopwords. This query builds two
queries from the added terms: low-frequency
terms are added to a required boolean clause and high-frequency terms are
added to an optional boolean clause. The optional clause is only executed if
the required "low-frequency" clause matches. Scores produced by this query
differ slightly from those of a plain BooleanQuery scorer, mainly because the
number of leaf queries in the required boolean clause differs. In most cases,
high-frequency terms are unlikely to contribute significantly to the document
score unless at least one of the low-frequency terms is matched. For queries
that contain such high-frequency terms, this can improve execution times
significantly.
CommonTermsQuery has several advantages over stopword filtering at index or
query time: a term is "classified" based on its actual document frequency in
the index, which can prevent slow queries even across domains without
specialized stopword files.
Note: if the query contains only high-frequency terms, it is rewritten into a plain conjunction query, i.e. all high-frequency terms must match for a document to match.
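The split described above can be illustrated without Lucene itself. The following is a minimal sketch in plain Java, not the library's implementation: the class TermSplitSketch, its split method, and the document-frequency map are hypothetical names introduced for illustration, and the cutoff is assumed to be an absolute document count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these names belong to Lucene.
class TermSplitSketch {

    /** Holds the two groups of terms produced by the split. */
    static final class Split {
        final List<String> lowFreq = new ArrayList<String>();   // would go to the required clause
        final List<String> highFreq = new ArrayList<String>();  // would go to the optional clause
        boolean allTermsRequired;                                // set when only high-frequency terms exist
    }

    /**
     * Classifies each term by its document frequency. A term whose document
     * frequency exceeds maxDocFreq is treated as high-frequency (assumption:
     * maxDocFreq is an absolute number of documents).
     */
    static Split split(Map<String, Integer> docFreqByTerm, int maxDocFreq) {
        Split result = new Split();
        for (Map.Entry<String, Integer> entry : docFreqByTerm.entrySet()) {
            if (entry.getValue() > maxDocFreq) {
                result.highFreq.add(entry.getKey());
            } else {
                result.lowFreq.add(entry.getKey());
            }
        }
        // With no low-frequency term available, the note above says the query
        // becomes a plain conjunction: every high-frequency term must match.
        result.allTermsRequired = result.lowFreq.isEmpty() && !result.highFreq.isEmpty();
        return result;
    }
}
```

For a query on "the lucene" where "the" occurs in far more documents than the cutoff, this sketch places "lucene" in the required group and "the" in the optional group.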
-
Constructor Summary
Constructor
    Description
CommonTermsQuery(BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency)
    Creates a new CommonTermsQuery.
CommonTermsQuery(BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency, boolean disableCoord)
    Creates a new CommonTermsQuery.
-
Method Summary
Modifier and Type, Method
    Description
void add(Term term)
    Adds a term to the CommonTermsQuery.
void collectTermContext(IndexReader reader, List<AtomicReaderContext> leaves, TermContext[] contextArray, Term[] queryTerms)
boolean equals(Object obj)
void extractTerms(Set<Term> terms)
    Expert: adds all terms occurring in this query to the terms set.
float getHighFreqMinimumNumberShouldMatch()
    Gets the minimum number of the optional high-frequency BooleanClauses which must be satisfied.
float getLowFreqMinimumNumberShouldMatch()
    Gets the minimum number of the optional low-frequency BooleanClauses which must be satisfied.
int hashCode()
boolean isCoordDisabled()
    Returns true iff Similarity.coord(int,int) is disabled in scoring for the high and low frequency query instance.
Query rewrite(IndexReader reader)
    Expert: called to re-write queries into primitive queries.
void setHighFreqMinimumNumberShouldMatch(float min)
    Specifies a minimum number of the high-frequency optional BooleanClauses which must be satisfied in order to produce a match on the high frequency terms query part.
void setLowFreqMinimumNumberShouldMatch(float min)
    Specifies a minimum number of the low-frequency optional BooleanClauses which must be satisfied in order to produce a match on the low frequency terms query part.
String toString(String field)
    Prints a query to a string, with field assumed to be the default field and omitted.
-
Constructor Details
-
CommonTermsQuery
public CommonTermsQuery(BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency)
Creates a new CommonTermsQuery.
- Parameters:
highFreqOccur - BooleanClause.Occur used for high frequency terms
lowFreqOccur - BooleanClause.Occur used for low frequency terms
maxTermFrequency - a value in [0..1) (or absolute number >=1) representing the maximum threshold of a term's document frequency to be considered a low frequency term.
- Throws:
IllegalArgumentException - if BooleanClause.Occur.MUST_NOT is passed as lowFreqOccur or highFreqOccur
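The two readings of maxTermFrequency (a fraction in [0..1) or an absolute count >=1) can be made concrete with a small sketch. This is plain Java written for illustration, not code taken from Lucene: the class MaxTermFrequencySketch and its methods are hypothetical, maxDoc stands for the number of documents in the index, and rounding the fractional cutoff up is an assumption.

```java
// Hypothetical illustration; not part of the Lucene API.
class MaxTermFrequencySketch {

    /**
     * Converts maxTermFrequency into an absolute document-frequency cutoff.
     * Values >= 1 are read as an absolute number of documents; values in
     * [0..1) are read as a fraction of maxDoc. Rounding the fraction up with
     * Math.ceil is an assumption of this sketch.
     */
    static int cutoff(float maxTermFrequency, int maxDoc) {
        if (maxTermFrequency >= 1f) {
            return (int) maxTermFrequency;
        }
        return (int) Math.ceil(maxTermFrequency * (float) maxDoc);
    }

    /** A term counts as low-frequency when its document frequency does not exceed the cutoff. */
    static boolean isLowFrequency(int docFreq, float maxTermFrequency, int maxDoc) {
        return docFreq <= cutoff(maxTermFrequency, maxDoc);
    }
}
```

Under these assumptions, a threshold of 0.25f on an index of 1000 documents and an absolute threshold of 250f classify the same terms as low-frequency.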
-
CommonTermsQuery
public CommonTermsQuery(BooleanClause.Occur highFreqOccur, BooleanClause.Occur lowFreqOccur, float maxTermFrequency, boolean disableCoord)
Creates a new CommonTermsQuery.
- Parameters:
highFreqOccur - BooleanClause.Occur used for high frequency terms
lowFreqOccur - BooleanClause.Occur used for low frequency terms
maxTermFrequency - a value in [0..1) (or absolute number >=1) representing the maximum threshold of a term's document frequency to be considered a low frequency term.
disableCoord - disables Similarity.coord(int,int) in scoring for the low / high frequency sub-queries
- Throws:
IllegalArgumentException - if BooleanClause.Occur.MUST_NOT is passed as lowFreqOccur or highFreqOccur
-
-
Method Details
-
add
public void add(Term term)
Adds a term to the CommonTermsQuery.
- Parameters:
term - the term to add
-
rewrite
Description copied from class: Query
Expert: called to re-write queries into primitive queries. For example, a PrefixQuery will be rewritten into a BooleanQuery that consists of TermQuerys.
- Overrides:
rewrite in class Query
- Throws:
IOException
-
collectTermContext
public void collectTermContext(IndexReader reader, List<AtomicReaderContext> leaves, TermContext[] contextArray, Term[] queryTerms) throws IOException
- Throws:
IOException
-
isCoordDisabled
public boolean isCoordDisabled()
Returns true iff Similarity.coord(int,int) is disabled in scoring for the high and low frequency query instance. The top level query will always disable coords.
-
setLowFreqMinimumNumberShouldMatch
public void setLowFreqMinimumNumberShouldMatch(float min)
Specifies a minimum number of the low-frequency optional BooleanClauses which must be satisfied in order to produce a match on the low frequency terms query part. This method accepts a float value in the range [0..1) as a fraction of the actual query terms in the low-frequency clause, or a number >=1 as an absolute number of clauses that need to match.
By default no optional clauses are necessary for a match (unless there are no required clauses). If this method is used, then the specified number of clauses is required.
- Parameters:
min - the number of optional clauses that must match
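The fraction-or-absolute convention for min can be sketched as follows. This is an illustrative plain-Java helper, not Lucene's code: the class MinShouldMatchSketch and its resolve method are hypothetical, and rounding the fractional result to the nearest integer is an assumption.

```java
// Hypothetical illustration; not part of the Lucene API.
class MinShouldMatchSketch {

    /**
     * Resolves the configured value into a number of optional clauses.
     * A value >= 1 is taken as an absolute clause count; a value in [0..1)
     * is taken as a fraction of numOptionalClauses. Rounding to the nearest
     * integer is an assumption of this sketch.
     */
    static int resolve(float min, int numOptionalClauses) {
        if (min >= 1.0f) {
            return (int) min;
        }
        return Math.round(min * numOptionalClauses);
    }
}
```

With four optional clauses, a value of 0.5f resolves to two clauses in this sketch, while 3f asks for three clauses regardless of how many optional clauses exist. The sketch does not cap an absolute value at the number of available clauses.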
-
getLowFreqMinimumNumberShouldMatch
public float getLowFreqMinimumNumberShouldMatch()
Gets the minimum number of the optional low-frequency BooleanClauses which must be satisfied.
-
setHighFreqMinimumNumberShouldMatch
public void setHighFreqMinimumNumberShouldMatch(float min)
Specifies a minimum number of the high-frequency optional BooleanClauses which must be satisfied in order to produce a match on the high frequency terms query part. This method accepts a float value in the range [0..1) as a fraction of the actual query terms in the high-frequency clause, or a number >=1 as an absolute number of clauses that need to match.
By default no optional clauses are necessary for a match (unless there are no required clauses). If this method is used, then the specified number of clauses is required.
- Parameters:
min - the number of optional clauses that must match
-
getHighFreqMinimumNumberShouldMatch
public float getHighFreqMinimumNumberShouldMatch()
Gets the minimum number of the optional high-frequency BooleanClauses which must be satisfied.
-
extractTerms
Description copied from class: Query
Expert: adds all terms occurring in this query to the terms set. Only works if this query is in its rewritten form.
- Overrides:
extractTerms in class Query
-
toString
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
-
hashCode
public int hashCode()
-
equals
public boolean equals(Object obj)
-