A map of tries to be matched for case-insensitive KBs
A map of tries to be matched for case-sensitive KBs
Labels matching all of the kbs and overrideKBs used in the matchers. They should be in the order that the kbs were specified and continue in the order that any additional labels are encountered in overrideKBs.
Set of single-token entity names that can be spelled using lower case, according to the KB(s)
If true, tokens are matched using lemmas, otherwise using words
An object able to validate any matches that are found
A map of tries to be matched for case-insensitive KBs
A map of tries to be matched for case-sensitive KBs
An object able to validate any matches that are found
The class is serializable, and this method is used during testing to determine whether a reconstituted object is equal to the original without interfering with the operation of equals or getting into hash codes.
The class is serializable, and this method is used during testing to determine whether a reconstituted object is equal to the original without interfering with the operation of equals or getting into hash codes. It is not necessary for this operation to be efficient or complete.
The object to compare to
Whether this and other are equal, at least as far as serialization is concerned
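A minimal sketch of what such a test-only comparison might look like. The class `TinyMatcher` and its fields are hypothetical stand-ins, not the actual class; the point is only that the comparison lives in its own method, so `equals` and `hashCode` keep their default behavior.

```scala
// Hypothetical stand-in for the real class; the field names are assumptions.
class TinyMatcher(val labels: Seq[String], val useLemmas: Boolean) extends Serializable {
  // Compares only the fields needed to judge a serialization round trip.
  // equals and hashCode are deliberately not overridden.
  def equalsForSerialization(other: AnyRef): Boolean = other match {
    case that: TinyMatcher => labels == that.labels && useLemmas == that.useLemmas
    case _ => false
  }
}
```

Two separately constructed instances with the same field values would compare as equal through this method while remaining unequal under the default `equals`.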
Matches the lexicons against this sentence
The input sentence
An array of BIO labels that stores the outcome of the matches
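To illustrate the shape of the returned array, here is a small sketch that turns matched spans into one BIO label per token. The `Span` type and the `toBio` helper are hypothetical and assume non-empty, non-overlapping spans with an exclusive end offset; the actual method may build its output differently.

```scala
object BioSketch {
  // A matched span: token offsets [start, end) plus an entity label. Hypothetical type.
  final case class Span(start: Int, end: Int, label: String)

  // Converts non-empty, non-overlapping spans into one BIO tag per token.
  def toBio(sentenceLength: Int, spans: Seq[Span]): Array[String] = {
    // Tokens outside any match keep the "O" tag.
    val tags = Array.fill(sentenceLength)("O")
    for (span <- spans) {
      tags(span.start) = "B-" + span.label
      for (i <- span.start + 1 until span.end)
        tags(i) = "I-" + span.label
    }
    tags
  }
}
```

For the six tokens of "Barack Obama visited New York City", the spans `Span(0, 2, "PER")` and `Span(3, 6, "LOC")` would give `B-PER I-PER O B-LOC I-LOC I-LOC`.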
Finds the longest match across all matchers.
Finds the longest match across all matchers. This means that the longest match is always chosen, even if it comes from a matcher with lower priority. Only ties are disambiguated, according to the order provided in the constructor.
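The selection rule can be sketched as follows. The `Candidate` type and `pickLongest` function are hypothetical; the sketch assumes each matcher reports a match length in tokens for the current position and that candidates are passed in the constructor's order.

```scala
object LongestMatchSketch {
  // Result of one matcher at a given position: match length in tokens and its label. Hypothetical type.
  final case class Candidate(length: Int, label: String)

  // Candidates are listed in matcher order (index 0 comes first in the constructor).
  // The longest candidate wins; on equal length the earlier matcher is kept.
  def pickLongest(candidates: Seq[Candidate]): Option[Candidate] =
    candidates.foldLeft(Option.empty[Candidate]) {
      case (None, c) if c.length > 0 => Some(c)
      case (Some(best), c) if c.length > best.length => Some(c)
      case (best, _) => best
    }
}
```

Because a later candidate replaces the current best only when it is strictly longer, a tie keeps the matcher that appears earlier in the order.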
Words known to appear with and without capitalized letters which help determine whether a span of text is contentful
Labels matching all of the kbs and overrideKBs used in the matchers.
Labels matching all of the kbs and overrideKBs used in the matchers. They should be in the order that the kbs were specified and continue in the order that any additional labels are encountered in overrideKBs.
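As a rough illustration of the expected ordering, the sketch below derives a label list from two label sequences. The `orderedLabels` helper and its parameters are hypothetical; it assumes one label per KB entry and that repeated labels keep their first position.

```scala
object LabelOrderSketch {
  // Labels from kbs come first, in the order given; labels first seen in overrideKBs
  // are appended in the order they are encountered. Duplicates keep their first position.
  def orderedLabels(kbLabels: Seq[String], overrideLabels: Seq[String]): Seq[String] =
    (kbLabels ++ overrideLabels).distinct
}
```

With `kbLabels = Seq("PER", "LOC")` and `overrideLabels = Seq("LOC", "ORG", "PER", "GENE")`, the result would be `PER, LOC, ORG, GENE`.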
If false, use the words of a sentence; if true, the lemmas
Lexicon-based NER which efficiently recognizes entities from large dictionaries by combining like matchers
Case-insensitive matching is performed by one matcher and case-sensitive matching by the other. Each can account for multiple KBs. Each IntHashTrie stores Ints which indicate which of the KBs an entry comes from. The KBs, either from the kbs or overrideKBs in LexiconNER.apply, have priorities, and the one with the highest priority is recorded.
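The idea of recording only the highest-priority KB per entry can be sketched with a plain hash map. `PriorityEntries` is a simplified, hypothetical stand-in and not the real IntHashTrie; it also assumes that a smaller Int means a higher priority, which the description above does not state.

```scala
import scala.collection.mutable

// Simplified stand-in for a structure that stores one Int per entry; NOT the real IntHashTrie.
class PriorityEntries(caseInsensitive: Boolean) {
  private val entries = mutable.HashMap.empty[Seq[String], Int]

  // Case-insensitive instances fold tokens to lower case before storing or looking up.
  private def normalize(tokens: Seq[String]): Seq[String] =
    if (caseInsensitive) tokens.map(_.toLowerCase) else tokens

  // Assumption: a smaller label index means a higher priority.
  // An existing entry is overwritten only by a strictly higher-priority one.
  def add(tokens: Seq[String], labelIndex: Int): Unit = {
    val key = normalize(tokens)
    val keep = entries.get(key).forall(labelIndex < _)
    if (keep) entries(key) = labelIndex
  }

  def lookup(tokens: Seq[String]): Option[Int] = entries.get(normalize(tokens))
}
```

One instance created with `caseInsensitive = true` and another with `caseInsensitive = false` would mirror the split between the two matchers, each holding entries from several KBs.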