Class PhraseMatcher


  • public class PhraseMatcher
    extends Object
    Detects query phrases using an automaton. This class is thread safe.
    Author:
    bratseth
    • Constructor Detail

      • PhraseMatcher

        public PhraseMatcher(String phraseAutomatonFile)
        Creates a phrase matcher. Plural/singular form differences are not ignored when matching.
        Parameters:
        phraseAutomatonFile - the file containing phrases to match
        Throws:
        IllegalArgumentException - if the file is not found
      • PhraseMatcher

        public PhraseMatcher(String phraseAutomatonFile,
                             boolean ignorePluralForm)
        Creates a phrase matcher.
        Parameters:
        phraseAutomatonFile - the file containing phrases to match
        ignorePluralForm - whether to ignore plural/singular form differences when matching
        Throws:
        IllegalArgumentException - if the file is not found
      • PhraseMatcher

        public PhraseMatcher(com.yahoo.fsa.FSA phraseAutomatonFSA,
                             boolean ignorePluralForm)
        Creates a phrase matcher.
        Parameters:
        phraseAutomatonFSA - the FSA containing phrases to match
        ignorePluralForm - whether to ignore plural/singular form differences when matching
        Throws:
        IllegalArgumentException - if the FSA is null
    • Method Detail

      • isEmpty

        public boolean isEmpty()
      • setMatchPhraseItems

        public void setMatchPhraseItems(boolean matchPhraseItems)
        Sets whether to also match words contained in phrase items. Default is false: words contained in phrase items are not matched.
      • setMatchSingleItems

        public void setMatchSingleItems(boolean matchSingleItems)
        Sets whether single items should be matched and returned as phrase matches. Default is false.
      • setIgnorePluralForm

        public void setIgnorePluralForm(boolean ignorePluralForm)
        Sets whether to ignore plural/singular form differences when matching.
      • setMatchAll

        public void setMatchAll(boolean matchAll)
        Sets whether to return the longest matching phrase when there are overlapping matches (default), or all matching phrases.
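The difference between the two modes can be illustrated with a toy matcher. This is a minimal sketch, not this class's implementation: the class name OverlapSketch, the use of a Set of strings in place of the automaton, and the rule of resuming the scan after the longest match are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy illustration only: a Set stands in for the phrase automaton. */
public class OverlapSketch {

    /**
     * Returns the phrases found in words. With matchAll false, only the longest
     * phrase starting at a position is kept and scanning resumes after it
     * (an assumed overlap rule). With matchAll true, every phrase found at
     * every position is returned.
     */
    public static List<String> match(List<String> words, Set<String> phrases, boolean matchAll) {
        List<String> result = new ArrayList<>();
        int start = 0;
        while (start < words.size()) {
            String longest = null;
            int longestEnd = start;
            for (int end = start + 1; end <= words.size(); end++) {
                String candidate = String.join(" ", words.subList(start, end));
                if (!phrases.contains(candidate)) continue;
                if (matchAll) result.add(candidate);
                longest = candidate;   // later candidates are always longer
                longestEnd = end;
            }
            if (!matchAll && longest != null) {
                result.add(longest);
                start = longestEnd;    // skip the words consumed by the longest match
            } else {
                start++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> query = List.of("new", "york", "city", "hotels");
        Set<String> phrases = Set.of("new york", "new york city", "york city");
        System.out.println(match(query, phrases, false)); // [new york city]
        System.out.println(match(query, phrases, true));  // [new york, new york city, york city]
    }
}
```

In this sketch the default mode yields one non-overlapping phrase, while the match-all mode also reports the shorter and overlapping entries.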
      • matchPhrases

        public List<PhraseMatcher.Phrase> matchPhrases(Item queryItem)
        Finds all phrases (word sequences of length 1 or more) which constitute a complete entry in the automaton of this matcher. The words of a phrase must have the same index and must not be negative items of a NotItem.
        Parameters:
        queryItem - the root query item in which to match phrases
        Returns:
        the matched phrases, or null if there were no matches
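Because the result is null rather than an empty list when nothing matches, callers need a null check before iterating. One way to handle this is a small null-safe wrapper, sketched below. The class NullSafeResults and the helper orEmpty are hypothetical names introduced here and are not part of this API.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not part of PhraseMatcher. */
public class NullSafeResults {

    /** Returns the given list, or an immutable empty list if it is null. */
    public static <T> List<T> orEmpty(List<T> maybeNull) {
        return maybeNull != null ? maybeNull : Collections.emptyList();
    }

    public static void main(String[] args) {
        // Stands in for a matchPhrases result when there were no matches
        List<String> noMatches = null;
        for (String phrase : orEmpty(noMatches)) {
            System.out.println(phrase); // never reached for a null result
        }
        System.out.println(orEmpty(noMatches).size()); // 0
    }
}
```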
      • getNullMatcher

        public static PhraseMatcher getNullMatcher()
        Returns a phrase matcher which (quickly) never matches anything
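A matcher that never matches is an instance of the null object pattern: callers can hold a matcher unconditionally instead of checking for a missing one. The sketch below shows the pattern in isolation. NullMatcherSketch, its Set-based phrase store, its single-word matching, and the link between the null matcher and isEmpty are assumptions for illustration and say nothing about how this class implements it.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Toy illustration of the null object pattern; not PhraseMatcher's implementation. */
public class NullMatcherSketch {

    // A shared instance with no phrases, so it can never match
    private static final NullMatcherSketch NULL_MATCHER = new NullMatcherSketch(Set.of());

    private final Set<String> phrases;

    public NullMatcherSketch(Set<String> phrases) {
        this.phrases = phrases;
    }

    public static NullMatcherSketch getNullMatcher() {
        return NULL_MATCHER;
    }

    public boolean isEmpty() {
        return phrases.isEmpty();
    }

    /** Returns the words that are known phrases, or null if there were none. */
    public List<String> match(List<String> words) {
        if (isEmpty()) return null; // fast path: nothing to look up
        List<String> hits = words.stream().filter(phrases::contains).collect(Collectors.toList());
        return hits.isEmpty() ? null : hits;
    }

    public static void main(String[] args) {
        List<String> query = List.of("new", "york");
        System.out.println(getNullMatcher().isEmpty());                          // true
        System.out.println(getNullMatcher().match(query));                       // null
        System.out.println(new NullMatcherSketch(Set.of("york")).match(query));  // [york]
    }
}
```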