Class RewriterFeatures


  • public class RewriterFeatures
    extends Object
    Contains commonly used rewriter features
    Author:
    Karen Sze Wing Lee
    • Constructor Detail

      • RewriterFeatures

        public RewriterFeatures()
    • Method Detail

      • addUnitToOriginalQuery

        public static Query addUnitToOriginalQuery(Query query,
                                                   String boostingQuery,
                                                   boolean keepOriginalQuery)
                                            throws RuntimeException

        Adds proximity boosting to the original query by modifying the query tree directly.

        e.g. original Query Tree: (AND aa bb)
        boostingQuery: aa bb
        if keepOriginalQuery: true
        new Query Tree: (OR (AND aa bb) "aa bb")
        if keepOriginalQuery: false
        new Query Tree: "aa bb"

        original Query Tree: (OR (AND aa bb) (AND cc dd))
        boostingQuery: cc dd
        if keepOriginalQuery: true
        new Query Tree: (OR (AND aa bb) (AND cc dd) "cc dd")
        if keepOriginalQuery: false
        new Query Tree: (OR (AND aa bb) "cc dd")
        Parameters:
        query - Query object from searcher
        boostingQuery - query to be boosted
        keepOriginalQuery - whether to keep the original unboosted query as an equivalent alternative to the boosted phrase
        Returns:
        The modified Query object, or the original Query object on error
        Throws:
        RuntimeException
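
The two examples above can be reproduced with a small stand-alone model. This is only a sketch and does not use the real Query classes: it assumes the query is a flat list of AND branches and renders trees as strings, and the names UnitBoostSketch and boost are hypothetical. It also assumes that a query with no branch matching boostingQuery is returned unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model for illustration only; it does not use the real Query classes.
class UnitBoostSketch {

    /**
     * andBranches: each inner list is one AND branch, e.g. [aa, bb] for (AND aa bb).
     * Returns the rewritten tree rendered in the notation used in the examples.
     */
    static String boost(List<List<String>> andBranches, String boostingQuery, boolean keepOriginalQuery) {
        List<String> boostTokens = Arrays.asList(boostingQuery.trim().split("\\s+"));
        String phrase = "\"" + String.join(" ", boostTokens) + "\"";

        List<String> items = new ArrayList<>();
        boolean matched = false;
        for (List<String> branch : andBranches) {
            // Only the first branch equal to the boosting terms is treated as the match.
            boolean isMatch = !matched && branch.equals(boostTokens);
            if (isMatch) {
                matched = true;
            }
            if (isMatch && !keepOriginalQuery) {
                items.add(phrase); // replace the matching branch with the phrase
            } else {
                items.add("(AND " + String.join(" ", branch) + ")");
            }
        }
        if (matched && keepOriginalQuery) {
            items.add(phrase); // keep the branch and add the phrase as a sibling
        }
        // A single remaining item needs no enclosing OR.
        return items.size() == 1 ? items.get(0) : "(OR " + String.join(" ", items) + ")";
    }

    public static void main(String[] args) {
        List<List<String>> query = Arrays.asList(Arrays.asList("aa", "bb"), Arrays.asList("cc", "dd"));
        System.out.println(boost(query, "cc dd", true));
        System.out.println(boost(query, "cc dd", false));
    }
}
```

In this model the phrase is appended after the existing branches when keepOriginalQuery is true, which matches the examples; where the real method places it in larger trees is not stated here.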
      • addRewritesAsEquiv

        public static Query addRewritesAsEquiv(Query query,
                                               String matchingStr,
                                               String rewrites,
                                               boolean addUnitToRewrites,
                                               int maxNumRewrites)
                                        throws RuntimeException

        Adds query expansions (rewrites) to the query tree as alternatives to the original query.

        e.g. origQuery: aa bb
        matchingStr: aa bb
        rewrites: cc dd, ee ff
        if addUnitToRewrites: false
        new query tree: (OR (AND aa bb) (AND cc dd) (AND ee ff))
        if addUnitToRewrites: true
        new query tree: (OR (AND aa bb) "cc dd" "ee ff")
        Parameters:
        query - Query object from searcher
        matchingStr - string used to retrieve the rewrite
        rewrites - the rewrite string retrieved from the dictionary
        addUnitToRewrites - whether to add each rewrite as a unit (phrase) instead of an AND of its terms
        maxNumRewrites - maximum number of rewrites to add, 0 for no limit
        Returns:
        The modified Query object, or the original Query object on error
        Throws:
        RuntimeException
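
The effect of addUnitToRewrites and maxNumRewrites in the example above can be modelled with plain strings. This is a sketch under stated assumptions, not the library's implementation: the class and method names are hypothetical, trees are rendered as text, and the rewrites are split on commas only because the example lists them that way (the actual delimiter is not stated here).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; it does not use the real Query classes.
class RewriteEquivSketch {

    /** Renders whitespace-separated terms as an AND node, e.g. "aa bb" -> (AND aa bb). */
    static String and(String text) {
        return "(AND " + String.join(" ", text.trim().split("\\s+")) + ")";
    }

    static String addRewrites(String origQuery, String rewrites,
                              boolean addUnitToRewrites, int maxNumRewrites) {
        List<String> items = new ArrayList<>();
        items.add(and(origQuery));

        int added = 0;
        // Assumption: comma-separated rewrites, as written in the example.
        for (String raw : rewrites.split(",")) {
            String rewrite = raw.trim();
            if (rewrite.isEmpty()) {
                continue;
            }
            if (maxNumRewrites > 0 && added >= maxNumRewrites) {
                break; // 0 means no limit
            }
            // A unit is rendered as a quoted phrase, otherwise as an AND of the terms.
            items.add(addUnitToRewrites ? "\"" + rewrite + "\"" : and(rewrite));
            added++;
        }
        return items.size() == 1 ? items.get(0) : "(OR " + String.join(" ", items) + ")";
    }

    public static void main(String[] args) {
        System.out.println(addRewrites("aa bb", "cc dd, ee ff", false, 0));
        System.out.println(addRewrites("aa bb", "cc dd, ee ff", true, 1));
    }
}
```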
      • getNonOverlappingFullPhraseMatches

        public static Set<PhraseMatcher.Phrase> getNonOverlappingFullPhraseMatches(PhraseMatcher phraseMatcher,
                                                                                   Query query)
                                                                            throws RuntimeException

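        Retrieves the longest non-overlapping full-phrase matches in the query, scanning from left to right, based on the FSA dictionary. In the example below, only nyc is returned because it is the only operand whose complete text is a dictionary key.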

        e.g. query: ((modern AND new AND york AND city AND travel) OR travel) AND ((sunny AND travel AND agency) OR nyc)
        dictionary:
        mny\tmodern new york
        mo\tmodern
        modern\tn/a
        modern new york\tn/a
        new york\tn/a
        new york city\tn/a
        new york city travel\tn/a
        new york company\tn/a
        ny\tnew york
        nyc\tnew york city\tnew york company
        nyct\tnew york city travel
        ta\ttravel agency
        travel agency\tn/a
        return: nyc
        Parameters:
        phraseMatcher - PhraseMatcher object loaded with FSA dict
        query - Query object from the searcher
        Returns:
        Matching phrases
        Throws:
        RuntimeException
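
The full-phrase behaviour in the example above can be approximated without the FSA classes. The sketch below makes several assumptions: the dictionary key is taken to be the text before the first tab on each line, the query is reduced by hand to its four term groups, and FullPhraseSketch, keys and fullMatches are hypothetical names rather than part of the API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model for illustration only; it does not use PhraseMatcher or an FSA.
class FullPhraseSketch {

    /** Assumption: the key of a dictionary line is everything before the first tab. */
    static Set<String> keys(List<String> dictionaryLines) {
        Set<String> keys = new LinkedHashSet<>();
        for (String line : dictionaryLines) {
            keys.add(line.split("\t", 2)[0]);
        }
        return keys;
    }

    /** Returns the groups whose complete text is a dictionary key, in query order. */
    static List<String> fullMatches(List<List<String>> termGroups, Set<String> keys) {
        List<String> matches = new ArrayList<>();
        for (List<String> group : termGroups) {
            String text = String.join(" ", group);
            if (keys.contains(text) && !matches.contains(text)) {
                matches.add(text);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // A reduced version of the dictionary from the example.
        Set<String> keys = keys(Arrays.asList(
                "modern\tn/a",
                "new york city travel\tn/a",
                "nyc\tnew york city\tnew york company",
                "travel agency\tn/a"));
        // The four term groups of the example query.
        List<List<String>> groups = Arrays.asList(
                Arrays.asList("modern", "new", "york", "city", "travel"),
                Arrays.asList("travel"),
                Arrays.asList("sunny", "travel", "agency"),
                Arrays.asList("nyc"));
        System.out.println(fullMatches(groups, keys));
    }
}
```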
      • getNonOverlappingPartialPhraseMatches

        public static Set<PhraseMatcher.Phrase> getNonOverlappingPartialPhraseMatches(PhraseMatcher phraseMatcher,
                                                                                      Query query)
                                                                               throws RuntimeException

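        Retrieves the longest non-overlapping partial-phrase matches in the query, scanning from left to right, based on the FSA dictionary. Unlike full-phrase matching, a match may cover only part of a group of AND-ed terms.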

        e.g. query: ((modern AND new AND york AND city AND travel) OR travel) AND ((sunny AND travel AND agency) OR nyc)
        dictionary:
        mny\tmodern new york
        mo\tmodern
        modern\tn/a
        modern new york\tn/a
        new york\tn/a
        new york city\tn/a
        new york city travel\tn/a
        new york company\tn/a
        ny\tnew york
        nyc\tnew york city\tnew york company
        nyct\tnew york city travel
        ta\ttravel agency
        travel agency\tn/a
        return:
        modern
        new york city travel
        travel agency
        nyc
        Parameters:
        phraseMatcher - PhraseMatcher object loaded with FSA dict
        query - Query object from the searcher
        Returns:
        Matching phrases
        Throws:
        RuntimeException
      • getNonOverlappingMatchesInAndItem

        public static List<PhraseMatcher.Phrase> getNonOverlappingMatchesInAndItem(List<PhraseMatcher.Phrase> allMatches,
                                                                                   Query query)
                                                                            throws RuntimeException

        Retrieves the longest non-overlapping matches within an AndItem, ordered from left to right, based on the FSA dictionary.

        e.g. subtree: (modern AND new AND york AND city AND travel)
        dictionary:
        mny\tmodern new york
        mo\tmodern
        modern\tn/a
        modern new york\tn/a
        new york\tn/a
        new york city\tn/a
        new york city travel\tn/a
        new york company\tn/a
        ny\tnew york
        nyc\tnew york city\tnew york company
        nyct\tnew york city travel
        allMatches:
        modern
        modern new york
        new york
        new york city
        new york city travel
        return:
        modern
        new york city travel
        Parameters:
        allMatches - All matches within the subtree
        query - Query object from the searcher
        Returns:
        Matching phrases
        Throws:
        RuntimeException
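
The example above returns modern and new york city travel rather than modern new york, which suggests that longer matches win over earlier ones. The sketch below shows one selection rule that is consistent with that example: take candidates longest first, skip any that overlap an already chosen span, then report the result from left to right. This rule is inferred from the example and may differ from the actual implementation; the class, the Span type and the select method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Toy model for illustration only; it does not use PhraseMatcher.Phrase.
class NonOverlappingSketch {

    /** A candidate match covering tokens [start, start + length). */
    static final class Span {
        final int start;
        final int length;
        final String text;

        Span(int start, int length, String text) {
            this.start = start;
            this.length = length;
            this.text = text;
        }

        int end() {
            return start + length;
        }
    }

    static List<String> select(List<String> tokens, List<String> allMatches) {
        List<Span> candidates = new ArrayList<>();
        for (String phrase : allMatches) {
            List<String> words = Arrays.asList(phrase.split(" "));
            // Only the first occurrence of each phrase is considered in this sketch.
            int at = Collections.indexOfSubList(tokens, words);
            if (at >= 0) {
                candidates.add(new Span(at, words.size(), phrase));
            }
        }
        // Longest first; ties are broken by the leftmost start.
        candidates.sort(Comparator.comparingInt((Span s) -> -s.length)
                .thenComparingInt((Span s) -> s.start));

        List<Span> chosen = new ArrayList<>();
        for (Span candidate : candidates) {
            boolean overlaps = false;
            for (Span kept : chosen) {
                if (candidate.start < kept.end() && kept.start < candidate.end()) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                chosen.add(candidate);
            }
        }
        // Report the selected matches from left to right.
        chosen.sort(Comparator.comparingInt((Span s) -> s.start));
        List<String> result = new ArrayList<>();
        for (Span span : chosen) {
            result.add(span.text);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("modern", "new", "york", "city", "travel");
        List<String> allMatches = Arrays.asList("modern", "modern new york", "new york",
                "new york city", "new york city travel");
        System.out.println(select(tokens, allMatches));
    }
}
```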
      • addExpansions

        public static Query addExpansions(Query query,
                                          Set<PhraseMatcher.Phrase> matches,
                                          String expandIndex,
                                          int maxNumRewrites,
                                          boolean removeOriginal,
                                          boolean addUnitToRewrites)
                                   throws RuntimeException

        Adds expansions to the matching phrases.

        e.g. Query: nyc travel agency
        matching phrases:
        nyc\tnew york city\tnew york company
        travel agency\tn/a
        if expandIndex is not null and removeOriginal is true
        New Query: ((new york city) OR ([expandIndex]:new york city) OR (new york company) OR ([expandIndex]:new york company)) AND ((travel agency) OR ([expandIndex]:travel agency))
        if expandIndex is null and removeOriginal is true
        New Query: ((new york city) OR (new york company)) AND travel agency
        if expandIndex is null and removeOriginal is false
        New Query: (nyc OR (new york city) OR (new york company)) AND travel agency
        Parameters:
        query - Query object from searcher
        matches - Set of longest non-overlapping matches
        expandIndex - name of the expansion index, or null to use the default index
        maxNumRewrites - maximum number of rewrites to add, 0 for no limit
        removeOriginal - whether to remove the original matching phrase
        addUnitToRewrites - whether to add each rewrite as a phrase
        Throws:
        RuntimeException
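
The three example outputs above can be reproduced with a string-based model of the expandIndex and removeOriginal combinations. This is a sketch under assumptions, not the actual implementation: it assumes the matches cover the whole query, treats a phrase with no rewrites (n/a) as always kept, ignores addUnitToRewrites, and uses the hypothetical names ExpansionSketch, expandPhrase and addExpansions. The index name title in the demo is also made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; it does not use the real Query classes.
class ExpansionSketch {

    static String expandPhrase(String phrase, List<String> rewrites, String expandIndex,
                               int maxNumRewrites, boolean removeOriginal) {
        List<String> alternatives = new ArrayList<>();
        // Assumption: a phrase without rewrites is kept even if removeOriginal is true.
        if (rewrites.isEmpty() || !removeOriginal) {
            alternatives.add(phrase);
        }
        int limit = maxNumRewrites > 0 ? Math.min(maxNumRewrites, rewrites.size()) : rewrites.size();
        alternatives.addAll(rewrites.subList(0, limit));

        if (alternatives.size() == 1 && expandIndex == null) {
            return alternatives.get(0); // nothing to combine
        }
        List<String> rendered = new ArrayList<>();
        for (String alternative : alternatives) {
            rendered.add(alternative.contains(" ") ? "(" + alternative + ")" : alternative);
            if (expandIndex != null) {
                // Each alternative is also searched in the expansion index.
                rendered.add("(" + expandIndex + ":" + alternative + ")");
            }
        }
        return "(" + String.join(" OR ", rendered) + ")";
    }

    /** matches maps each matched phrase, in query order, to its rewrites (empty for n/a). */
    static String addExpansions(Map<String, List<String>> matches, String expandIndex,
                                int maxNumRewrites, boolean removeOriginal) {
        List<String> parts = new ArrayList<>();
        for (Map.Entry<String, List<String>> match : matches.entrySet()) {
            parts.add(expandPhrase(match.getKey(), match.getValue(), expandIndex,
                    maxNumRewrites, removeOriginal));
        }
        return String.join(" AND ", parts);
    }

    public static void main(String[] args) {
        Map<String, List<String>> matches = new LinkedHashMap<>();
        matches.put("nyc", Arrays.asList("new york city", "new york company"));
        matches.put("travel agency", Collections.<String>emptyList());

        System.out.println(addExpansions(matches, "title", 0, true));
        System.out.println(addExpansions(matches, null, 0, true));
        System.out.println(addExpansions(matches, null, 0, false));
    }
}
```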
      • convertMatchToString

        public static String convertMatchToString(PhraseMatcher.Phrase phrase)
        Converts a match to its String form.
        Parameters:
        phrase - Match from PhraseMatcher
        Returns:
        String format of the phrase