public class SimpleQueryParser extends QueryBuilder
The main idea behind this parser is that a person should be able to type whatever they want to represent a query, and this parser will do its best to interpret what to search for no matter how poorly composed the request may be. A token is any term, phrase, or subquery for the operations described below. Tokens may be delimited by whitespace (' ', '\n', '\r', and '\t') and by certain operators: ( ) + | "
Any errors in query syntax will be ignored and the parser will attempt to decipher what it can; however, this may mean odd or unexpected results.
- '+' specifies AND operation: token1+token2
- '|' specifies OR operation: token1|token2
- '-' negates a single token: -token0
- '"' creates phrases of terms: "term1 term2 ..."
- '*' at the end of terms specifies prefix query: term*
- '~N' at the end of terms specifies fuzzy query: term~1
- '~N' at the end of phrases specifies near query: "term1 term2"~5
- '(' and ')' specifies precedence: token1 + (token2 | token3)
The default operator is OR if no other operator is specified. For example, the following will OR token1 and token2 together: token1 token2

Normal operator precedence will be simple order from right to left. For example, the following will evaluate token1 OR token2 first, then AND with token3: token1 | token2 + token3
An individual term may contain any possible character with certain characters requiring escaping using a '\'. The following characters will need to be escaped in terms and phrases: + | " ( ) ' \
The '-' operator is a special case. On individual terms (not phrases) the first character of a term that is '-' must be escaped; however, any '-' characters beyond the first character do not need to be escaped. For example:

- -term1 -- Specifies NOT operation against term1
- \-term1 -- Searches for the term -term1.
- term-1 -- Searches for the term term-1.
- term\-1 -- Searches for the term term-1.
The '*' operator is a special case. On individual terms (not phrases) the last character of a term that is '*' must be escaped; however, any '*' characters before the last character do not need to be escaped:

- term1* -- Searches for the prefix term1
- term1\* -- Searches for the term term1*
- term*1 -- Searches for the term term*1
- term\*1 -- Searches for the term term*1
Note that the above examples consider the terms before text processing.
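The escaping rules above can be sketched as a small helper that prepares a literal term before it is placed in a query string. This is a minimal illustration only: `TermEscaperSketch` and `escapeTerm` are hypothetical names, not part of the parser's API, and the helper covers only the characters and special cases listed in this description.

```java
// Hypothetical helper, not part of SimpleQueryParser.
final class TermEscaperSketch {
    // Characters listed above as always needing a backslash: + | " ( ) ' \
    private static final String ALWAYS_ESCAPED = "+|\"()'\\";

    private TermEscaperSketch() {}

    /** Returns the term with a backslash added wherever the rules above require one. */
    static String escapeTerm(String term) {
        StringBuilder out = new StringBuilder(term.length() + 4);
        int last = term.length() - 1;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            // '-' only needs escaping as the first character of a term.
            boolean leadingMinus = (c == '-' && i == 0);
            // '*' only needs escaping as the last character of a term.
            boolean trailingStar = (c == '*' && i == last);
            if (ALWAYS_ESCAPED.indexOf(c) >= 0 || leadingMinus || trailingStar) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }
}
```

With this sketch, `escapeTerm("-term1")` yields `\-term1` and `escapeTerm("term1*")` yields `term1\*`, while `term-1` and `term*1` pass through unchanged.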
Modifier and Type | Field and Description |
---|---|
static int | AND_OPERATOR: Enables AND operator (`+`) |
static int | ESCAPE_OPERATOR: Enables ESCAPE operator (`\`) |
static int | FUZZY_OPERATOR: Enables FUZZY operators: (`~`) on single terms |
static int | NEAR_OPERATOR: Enables NEAR operators: (`~`) on phrases |
static int | NOT_OPERATOR: Enables NOT operator (`-`) |
static int | OR_OPERATOR: Enables OR operator (`\|`) |
static int | PHRASE_OPERATOR: Enables PHRASE operator (`"`) |
static int | PRECEDENCE_OPERATORS: Enables PRECEDENCE operators: `(` and `)` |
static int | PREFIX_OPERATOR: Enables PREFIX operator (`*`) |
static int | WHITESPACE_OPERATOR: Enables WHITESPACE operators: ' ' '\n' '\r' '\t' |
Constructor and Description |
---|
`SimpleQueryParser(Analyzer analyzer, Map<String,Float> weights)`: Creates a new parser searching over multiple fields with different weights. |
`SimpleQueryParser(Analyzer analyzer, Map<String,Float> weights, int flags)`: Creates a new parser with custom flags used to enable/disable certain features. |
`SimpleQueryParser(Analyzer analyzer, String field)`: Creates a new parser searching over a single field. |
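As a rough sketch of how the arguments for the multi-field constructors might be prepared, the class below builds a field-to-weight map and combines operator flags. It assumes the flags combine as a bitmask, which is the usual convention for `int` flag constants. The field names, the weights, and the numeric bit values are placeholders for illustration; real code should reference the `SimpleQueryParser` constants rather than hard-coded numbers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical setup sketch; nothing here is part of the parser's API.
final class ParserSetupSketch {
    // Placeholder bit values standing in for the SimpleQueryParser constants.
    static final int AND_OPERATOR = 1 << 0;
    static final int NOT_OPERATOR = 1 << 1;
    static final int OR_OPERATOR = 1 << 2;
    static final int PHRASE_OPERATOR = 1 << 4;

    private ParserSetupSketch() {}

    /** Enables only AND, OR and phrases; every other operator stays off. */
    static int basicFlags() {
        return AND_OPERATOR | OR_OPERATOR | PHRASE_OPERATOR;
    }

    /** Checks whether a given operator bit is set in a flags value. */
    static boolean isEnabled(int flags, int operator) {
        return (flags & operator) != 0;
    }

    /** Field names and weights are made up for this example. */
    static Map<String, Float> exampleWeights() {
        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("title", 2.0f); // weighted higher than body
        weights.put("body", 1.0f);
        return weights;
    }
}
```

The resulting map and flags value would then be passed as the `weights` and `flags` arguments of the three-argument constructor, together with an `Analyzer`.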
Modifier and Type | Method and Description |
---|---|
BooleanClause.Occur | `getDefaultOperator()`: Returns the implicit operator setting, which will be either SHOULD or MUST. |
Query | `parse(String queryText)`: Parses the query text and returns parsed query (or null if empty) |
void | `setDefaultOperator(BooleanClause.Occur operator)`: Sets the implicit operator setting, which must be either SHOULD or MUST. |
Methods inherited from class QueryBuilder: createBooleanQuery, createBooleanQuery, createMinShouldMatchQuery, createPhraseQuery, createPhraseQuery, getAnalyzer, getEnablePositionIncrements, setAnalyzer, setEnablePositionIncrements
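public static final int AND_OPERATOR
Enables AND operator (+)

public static final int NOT_OPERATOR
Enables NOT operator (-)

public static final int OR_OPERATOR
Enables OR operator (|)

public static final int PREFIX_OPERATOR
Enables PREFIX operator (*)

public static final int PHRASE_OPERATOR
Enables PHRASE operator (")

public static final int PRECEDENCE_OPERATORS
Enables PRECEDENCE operators: ( and )

public static final int ESCAPE_OPERATOR
Enables ESCAPE operator (\)

public static final int WHITESPACE_OPERATOR
Enables WHITESPACE operators: ' ' '\n' '\r' '\t'

public static final int FUZZY_OPERATOR
Enables FUZZY operators: (~) on single terms

public static final int NEAR_OPERATOR
Enables NEAR operators: (~) on phrases

public SimpleQueryParser(Analyzer analyzer, String field)
Creates a new parser searching over a single field.

public SimpleQueryParser(Analyzer analyzer, Map<String,Float> weights)
Creates a new parser searching over multiple fields with different weights.

public SimpleQueryParser(Analyzer analyzer, Map<String,Float> weights, int flags)
Creates a new parser with custom flags used to enable/disable certain features.

public Query parse(String queryText)
Parses the query text and returns parsed query (or null if empty)

public BooleanClause.Occur getDefaultOperator()
Returns the implicit operator setting, which will be either SHOULD or MUST.

public void setDefaultOperator(BooleanClause.Occur operator)
Sets the implicit operator setting, which must be either SHOULD or MUST.

Copyright © 2010 - 2020 Adobe. All Rights Reserved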