Package com.yahoo.search.query.rewrite
Class RewriterFeatures
- java.lang.Object
-
- com.yahoo.search.query.rewrite.RewriterFeatures
-
public class RewriterFeatures extends java.lang.Object
Contains commonly used rewriter features.
- Author:
- Karen Sze Wing Lee
-
-
Constructor Summary
RewriterFeatures()
-
Method Summary
Modifier and Type / Method / Description
static Query
addExpansions(Query query, java.util.Set<PhraseMatcher.Phrase> matches, java.lang.String expandIndex, int maxNumRewrites, boolean removeOriginal, boolean addUnitToRewrites)
Add expansions to the matching phrases
static Query
addRewritesAsEquiv(Query query, java.lang.String matchingStr, java.lang.String rewrites, boolean addUnitToRewrites, int maxNumRewrites)
Add query expansion to the query tree
static Query
addUnitToOriginalQuery(Query query, java.lang.String boostingQuery, boolean keepOriginalQuery)
Add proximity boosting to original query by modifying the query tree directly
static java.lang.String
convertMatchToString(PhraseMatcher.Phrase phrase)
Convert Match to String
static java.util.Set<PhraseMatcher.Phrase>
getNonOverlappingFullPhraseMatches(PhraseMatcher phraseMatcher, Query query)
Retrieve the longest non-overlapping full phrase matches in the query, scanning left to right, based on the FSA dictionary
static java.util.List<PhraseMatcher.Phrase>
getNonOverlappingMatchesInAndItem(java.util.List<PhraseMatcher.Phrase> allMatches, Query query)
Retrieve the longest non-overlapping matches in an AndItem, scanning left to right, based on the FSA dictionary
static java.util.Set<PhraseMatcher.Phrase>
getNonOverlappingPartialPhraseMatches(PhraseMatcher phraseMatcher, Query query)
Retrieve the longest non-overlapping partial phrase matches in the query, scanning left to right, based on the FSA dictionary
-
-
-
Method Detail
-
addUnitToOriginalQuery
public static Query addUnitToOriginalQuery(Query query, java.lang.String boostingQuery, boolean keepOriginalQuery) throws java.lang.RuntimeException
Add proximity boosting to original query by modifying the query tree directly
e.g. original Query Tree: (AND aa bb)
boostingQuery: aa bb
if keepOriginalQuery: true
new Query tree: (OR (AND aa bb) "aa bb")
if keepOriginalQuery: false
new Query Tree: "aa bb"
original Query Tree: (OR (AND aa bb) (AND cc dd))
boostingQuery: cc dd
if keepOriginalQuery: true
new Query Tree: (OR (AND aa bb) (AND cc dd) "cc dd")
if keepOriginalQuery: false
new Query Tree: (OR (AND aa bb) "cc dd")
- Parameters:
query - Query object from searcher
boostingQuery - query to be boosted
keepOriginalQuery - whether to keep original unboosted query as equiv
- Returns:
- Modified Query object, return original query object on error
- Throws:
java.lang.RuntimeException
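
The tree transformation described above can be illustrated without any Vespa classes. The sketch below is a hypothetical string model, not the library's implementation: the query is assumed to be a list of AND groups under an implicit OR, and the class name `UnitBoostSketch` and method `addUnit` are invented for illustration. It only reproduces the two documented examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the real Query tree: a list of AND groups under one OR.
final class UnitBoostSketch {

    /** Renders an AND group such as [aa, bb] as "(AND aa bb)". */
    static String and(List<String> terms) {
        return "(AND " + String.join(" ", terms) + ")";
    }

    /** Adds the boosting query as a quoted phrase, keeping or replacing the matching AND group. */
    static String addUnit(List<List<String>> andGroups, String boostingQuery, boolean keepOriginalQuery) {
        List<String> boostTerms = Arrays.asList(boostingQuery.trim().split("\\s+"));
        String phrase = "\"" + String.join(" ", boostTerms) + "\"";
        List<String> branches = new ArrayList<>();
        boolean matched = false;
        for (List<String> group : andGroups) {
            if (!matched && group.equals(boostTerms)) {
                matched = true;
                if (!keepOriginalQuery) {
                    branches.add(phrase);   // the phrase replaces the unboosted group
                    continue;
                }
            }
            branches.add(and(group));
        }
        if (matched && keepOriginalQuery) {
            branches.add(phrase);           // the phrase is added next to the unboosted group
        }
        // A single branch needs no OR wrapper; an unmatched query is rendered unchanged.
        return branches.size() == 1 ? branches.get(0) : "(OR " + String.join(" ", branches) + ")";
    }

    public static void main(String[] args) {
        List<List<String>> tree = Arrays.asList(Arrays.asList("aa", "bb"), Arrays.asList("cc", "dd"));
        System.out.println(addUnit(tree, "cc dd", true));   // (OR (AND aa bb) (AND cc dd) "cc dd")
        System.out.println(addUnit(tree, "cc dd", false));  // (OR (AND aa bb) "cc dd")
    }
}
```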
-
addRewritesAsEquiv
public static Query addRewritesAsEquiv(Query query, java.lang.String matchingStr, java.lang.String rewrites, boolean addUnitToRewrites, int maxNumRewrites) throws java.lang.RuntimeException
Add query expansion to the query tree
e.g. origQuery: aa bb
matchingStr: aa bb
rewrites: cc dd, ee ff
if addUnitToRewrites: false
new query tree: (OR (AND aa bb) (AND cc dd) (AND ee ff))
if addUnitToRewrites: true
new query tree: (OR (AND aa bb) "cc dd" "ee ff")
- Parameters:
query - Query object from searcher
matchingStr - string used to retrieve the rewrite
rewrites - The rewrite string retrieved from dictionary
addUnitToRewrites - Whether to add unit to rewrites
maxNumRewrites - Max number of rewrites to be added, 0 if no limit
- Returns:
- Modified Query object, return original query object on error
- Throws:
java.lang.RuntimeException
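
As a rough illustration of the expansion above, the following sketch builds the documented tree strings from plain strings. It is not the library's code: `EquivRewriteSketch` and `expand` are hypothetical names, the rewrites are passed as an already split list (the delimiter used inside the real `rewrites` string is not assumed here), and the matching string is assumed to cover the whole original query, as in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical string model of adding rewrites as OR branches next to the original query.
final class EquivRewriteSketch {

    /** Renders "aa bb" as "(AND aa bb)". */
    static String asAnd(String text) {
        return "(AND " + String.join(" ", text.trim().split("\\s+")) + ")";
    }

    static String expand(String origQuery, List<String> rewrites, boolean addUnitToRewrites, int maxNumRewrites) {
        List<String> branches = new ArrayList<>();
        branches.add(asAnd(origQuery));
        // 0 means "no limit" per the parameter description; negative values are
        // treated the same way here, which is an assumption of this sketch.
        int limit = maxNumRewrites <= 0 ? rewrites.size() : Math.min(maxNumRewrites, rewrites.size());
        for (String rewrite : rewrites.subList(0, limit)) {
            // A "unit" is rendered as a quoted phrase, otherwise as an AND of the terms.
            branches.add(addUnitToRewrites ? "\"" + rewrite + "\"" : asAnd(rewrite));
        }
        return branches.size() == 1 ? branches.get(0) : "(OR " + String.join(" ", branches) + ")";
    }

    public static void main(String[] args) {
        List<String> rewrites = Arrays.asList("cc dd", "ee ff");
        System.out.println(expand("aa bb", rewrites, false, 0)); // (OR (AND aa bb) (AND cc dd) (AND ee ff))
        System.out.println(expand("aa bb", rewrites, true, 0));  // (OR (AND aa bb) "cc dd" "ee ff")
    }
}
```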
-
getNonOverlappingFullPhraseMatches
public static java.util.Set<PhraseMatcher.Phrase> getNonOverlappingFullPhraseMatches(PhraseMatcher phraseMatcher, Query query) throws java.lang.RuntimeException
Retrieve the longest non-overlapping full phrase matches in the query, scanning left to right, based on the FSA dictionary
e.g. query: ((modern AND new AND york AND city AND travel) OR travel) AND ((sunny AND travel AND agency) OR nyc)
dictionary:
mny\tmodern new york
mo\tmodern
modern\tn/a
modern new york\tn/a
new york\tn/a
new york city\tn/a
new york city travel\tn/a
new york company\tn/a
ny\tnew york
nyc\tnew york city\tnew york company
nyct\tnew york city travel
ta\ttravel agency
travel agency\tn/a
return: nyc
- Parameters:
phraseMatcher - PhraseMatcher object loaded with FSA dict
query - Query object from the searcher
- Returns:
- Matching phrases
- Throws:
java.lang.RuntimeException
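
One way to read the example above is that a full phrase match requires an entire query item (a whole AND group or a single term) to equal a dictionary key. The sketch below illustrates that reading with plain collections; it is an assumption-based model, not the library's implementation, and `FullPhraseSketch` and `fullMatches` are hypothetical names. The dictionary is reduced to its set of keys.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: each query item is a list of terms; the dictionary is a set of keys.
final class FullPhraseSketch {

    /** Returns the items whose complete text is a dictionary key, in query order. */
    static Set<String> fullMatches(List<List<String>> items, Set<String> dictionaryKeys) {
        Set<String> matches = new LinkedHashSet<>();
        for (List<String> item : items) {
            String text = String.join(" ", item);
            if (dictionaryKeys.contains(text)) {
                matches.add(text);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>(Arrays.asList(
                "mny", "mo", "modern", "modern new york", "new york", "new york city",
                "new york city travel", "new york company", "ny", "nyc", "nyct",
                "ta", "travel agency"));
        List<List<String>> items = Arrays.asList(
                Arrays.asList("modern", "new", "york", "city", "travel"),
                Arrays.asList("travel"),
                Arrays.asList("sunny", "travel", "agency"),
                Arrays.asList("nyc"));
        System.out.println(fullMatches(items, keys)); // [nyc]
    }
}
```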
-
getNonOverlappingPartialPhraseMatches
public static java.util.Set<PhraseMatcher.Phrase> getNonOverlappingPartialPhraseMatches(PhraseMatcher phraseMatcher, Query query) throws java.lang.RuntimeException
Retrieve the longest non-overlapping partial phrase matches in the query, scanning left to right, based on the FSA dictionary
e.g. query: ((modern AND new AND york AND city AND travel) OR travel) AND ((sunny AND travel AND agency) OR nyc)
dictionary:
mny\tmodern new york
mo\tmodern
modern\tn/a
modern new york\tn/a
new york\tn/a
new york city\tn/a
new york city travel\tn/a
new york company\tn/a
ny\tnew york
nyc\tnew york city\tnew york company
nyct\tnew york city travel
ta\ttravel agency
travel agency\tn/a
return:
modern
new york city travel
travel agency
nyc
- Parameters:
phraseMatcher - PhraseMatcher object loaded with FSA dict
query - Query object from the searcher
- Returns:
- Matching phrases
- Throws:
java.lang.RuntimeException
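
A partial phrase match can be read as a dictionary key that covers a contiguous run of terms inside an item, rather than the whole item. The sketch below enumerates every such candidate span for one item; choosing among overlapping candidates is a separate step. This is an illustrative model under that assumption, not the library's implementation, and `SpanEnumerationSketch` and `allMatches` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: enumerate all contiguous term runs that are dictionary keys.
final class SpanEnumerationSketch {

    /** Returns matching spans ordered by start position, then by length. */
    static List<String> allMatches(List<String> terms, Set<String> dictionaryKeys) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < terms.size(); start++) {
            for (int end = start + 1; end <= terms.size(); end++) {
                String span = String.join(" ", terms.subList(start, end));
                if (dictionaryKeys.contains(span)) {
                    found.add(span);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>(Arrays.asList(
                "mny", "mo", "modern", "modern new york", "new york", "new york city",
                "new york city travel", "new york company", "ny", "nyc", "nyct",
                "ta", "travel agency"));
        // [modern, modern new york, new york, new york city, new york city travel]
        System.out.println(allMatches(Arrays.asList("modern", "new", "york", "city", "travel"), keys));
        // [travel agency]
        System.out.println(allMatches(Arrays.asList("sunny", "travel", "agency"), keys));
    }
}
```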
-
getNonOverlappingMatchesInAndItem
public static java.util.List<PhraseMatcher.Phrase> getNonOverlappingMatchesInAndItem(java.util.List<PhraseMatcher.Phrase> allMatches, Query query) throws java.lang.RuntimeException
Retrieve the longest non-overlapping matches in an AndItem, scanning left to right, based on the FSA dictionary
e.g. subtree: (modern AND new AND york AND city AND travel)
dictionary:
mny\tmodern new york
mo\tmodern
modern\tn/a
modern new york\tn/a
new york\tn/a
new york city\tn/a
new york city travel\tn/a
new york company\tn/a
ny\tnew york
nyc\tnew york city\tnew york company
nyct\tnew york city travel
allMatches:
modern
modern new york
new york
new york city
new york city travel
return:
modern
new york city travel
- Parameters:
allMatches - All matches within the subtree
query - Query object from the searcher
- Returns:
- Matching phrases
- Throws:
java.lang.RuntimeException
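
The documented result (modern, new york city travel) is consistent with a selection rule that prefers longer matches, breaks ties by the leftmost start, and drops anything overlapping an already chosen match. The sketch below implements that rule on plain spans. Whether the library uses exactly this rule is an assumption; `NonOverlapSketch`, `Span`, and `select` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of picking the longest non-overlapping matches within one AND item.
final class NonOverlapSketch {

    /** A match covering terms [start, start + length) of the AND item. */
    static final class Span {
        final int start;
        final int length;
        final String text;

        Span(int start, int length, String text) {
            this.start = start;
            this.length = length;
            this.text = text;
        }

        int end() {
            return start + length; // exclusive end index
        }
    }

    static List<String> select(List<Span> allMatches) {
        List<Span> byPriority = new ArrayList<>(allMatches);
        // Longest first; among equal lengths, the leftmost start wins.
        byPriority.sort(Comparator.comparingInt((Span s) -> -s.length).thenComparingInt(s -> s.start));
        List<Span> chosen = new ArrayList<>();
        for (Span candidate : byPriority) {
            boolean overlaps = false;
            for (Span kept : chosen) {
                if (candidate.start < kept.end() && kept.start < candidate.end()) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                chosen.add(candidate);
            }
        }
        chosen.sort(Comparator.comparingInt((Span s) -> s.start)); // report left to right
        List<String> result = new ArrayList<>();
        for (Span s : chosen) {
            result.add(s.text);
        }
        return result;
    }

    public static void main(String[] args) {
        // Positions refer to: modern(0) new(1) york(2) city(3) travel(4)
        List<Span> all = Arrays.asList(
                new Span(0, 1, "modern"),
                new Span(0, 3, "modern new york"),
                new Span(1, 2, "new york"),
                new Span(1, 3, "new york city"),
                new Span(1, 4, "new york city travel"));
        System.out.println(select(all)); // [modern, new york city travel]
    }
}
```

A strictly greedy scan that takes the longest match at the leftmost position would pick "modern new york" first and lose "new york city travel", which is why this sketch ranks by length before position.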
-
addExpansions
public static Query addExpansions(Query query, java.util.Set<PhraseMatcher.Phrase> matches, java.lang.String expandIndex, int maxNumRewrites, boolean removeOriginal, boolean addUnitToRewrites) throws java.lang.RuntimeException
Add Expansions to the matching phrases
e.g. Query: nyc travel agency
matching phrases:
nyc\tnew york city\tnew york company
travel agency\tn/a
if expandIndex is not null and removeOriginal is true
New Query: ((new york city) OR ([expandIndex]:new york city) OR (new york company) OR ([expandIndex]:new york company)) AND ((travel agency) OR ([expandIndex]:travel agency))
if expandIndex is null and removeOriginal is true
New Query: ((new york city) OR (new york company)) AND travel agency
if expandIndex is null and removeOriginal is false
New Query: (nyc OR (new york city) OR (new york company)) AND travel agency
- Parameters:
query - Query object from searcher
matches - Set of longest non-overlapping matches
expandIndex - Name of expansion index or null if default index
maxNumRewrites - Max number of rewrites to be added, 0 if no limit
removeOriginal - Whether to remove the original matching phrase
addUnitToRewrites - Whether to add rewrite as phrase
- Throws:
java.lang.RuntimeException
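
The two documented cases with a null expandIndex can be reproduced with a small string model. The sketch below is deliberately partial and not the library's implementation: it ignores expandIndex and addUnitToRewrites, treats a phrase with no rewrites (n/a) as always kept, and uses the hypothetical names `ExpansionSketch`, `expand`, and `render`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: each matched phrase maps to its rewrites (empty list for n/a).
final class ExpansionSketch {

    static String expand(Map<String, List<String>> matches, int maxNumRewrites, boolean removeOriginal) {
        List<String> groups = new ArrayList<>();
        for (Map.Entry<String, List<String>> match : matches.entrySet()) {
            List<String> rewrites = match.getValue();
            // 0 (or less, by assumption) means no limit on the number of rewrites.
            int limit = maxNumRewrites <= 0 ? rewrites.size() : Math.min(maxNumRewrites, rewrites.size());
            List<String> alternatives = new ArrayList<>();
            // The original phrase stays when asked to, or when there is nothing to replace it with.
            if (!removeOriginal || limit == 0) {
                alternatives.add(match.getKey());
            }
            alternatives.addAll(rewrites.subList(0, limit));
            groups.add(render(alternatives));
        }
        return String.join(" AND ", groups);
    }

    /** Renders one OR group; multi-word alternatives are parenthesized as in the documented output. */
    static String render(List<String> alternatives) {
        if (alternatives.size() == 1) {
            return alternatives.get(0);
        }
        List<String> parts = new ArrayList<>();
        for (String alt : alternatives) {
            parts.add(alt.contains(" ") ? "(" + alt + ")" : alt);
        }
        return "(" + String.join(" OR ", parts) + ")";
    }

    public static void main(String[] args) {
        Map<String, List<String>> matches = new LinkedHashMap<>();
        matches.put("nyc", Arrays.asList("new york city", "new york company"));
        matches.put("travel agency", Collections.<String>emptyList());
        // ((new york city) OR (new york company)) AND travel agency
        System.out.println(expand(matches, 0, true));
        // (nyc OR (new york city) OR (new york company)) AND travel agency
        System.out.println(expand(matches, 0, false));
    }
}
```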
-
convertMatchToString
public static java.lang.String convertMatchToString(PhraseMatcher.Phrase phrase)
Convert Match to String
- Parameters:
phrase - Match from PhraseMatcher
- Returns:
- String format of the phrase
-
-