When the FastLexiconNERBuilder is being used, true indicates that a CompactLexiconNER should be created; false indicates a CombinedLexiconNER.
When true, indicates that the FastLexiconNERBuilder is used to construct the LexiconNER; otherwise the SlowLexiconNERBuilder is used.
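As a rough illustration of how the two flags above interact, the following sketch maps each combination to the builder and result type named in the descriptions. The object name, the method name and the parameter names useFast and useCompact are hypothetical stand-ins, not names taken from the library.

```scala
object BuilderChoice {
  // Hypothetical stand-ins for the two flags described above.
  // Returns a description of which builder runs and what it produces.
  def describe(useFast: Boolean, useCompact: Boolean): String =
    if (!useFast) "SlowLexiconNERBuilder"
    else if (useCompact) "FastLexiconNERBuilder -> CompactLexiconNER"
    else "FastLexiconNERBuilder -> CombinedLexiconNER"
}
```

Note that in this sketch the compact flag has no effect when the slow builder is selected, which matches the wording "If the FastLexiconNERBuilder is being used".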
Creates a LexiconNER from a list of KBs. Note that the file name (minus the extension) of each KB becomes the name of the corresponding category. For example, /Some/Path/SomeCategory.tsv.gz yields the category name SomeCategory. Each of the KBs must contain one entity name per line.
KBs containing known entity names
Filter which decides if a matched entity is valid
If true, we use Sentence.lemmas instead of Sentence.words during matching
If true, tokens are matched case insensitively
The new LexiconNER
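The rule that turns a KB file name into a category name can be sketched as follows. This is only an illustration: categoryName is a hypothetical helper, and it assumes that the "extension" is everything from the first dot of the file name onward, which is consistent with the SomeCategory.tsv.gz example but may not match how the library treats file names that contain other dots.

```scala
object KbCategoryNames {
  // Hypothetical helper: derives a category name from a KB path by dropping
  // the directories and everything from the first dot of the file name on.
  def categoryName(kbPath: String): String = {
    // lastIndexOf returns -1 when there is no slash, so this starts at 0.
    val fileName = kbPath.substring(kbPath.lastIndexOf('/') + 1)
    val dot = fileName.indexOf('.')
    if (dot < 0) fileName else fileName.substring(0, dot)
  }
}
```

With this sketch, KbCategoryNames.categoryName("/Some/Path/SomeCategory.tsv.gz") evaluates to "SomeCategory".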
Creates a LexiconNER from a list of KBs. Note that the file name (minus the extension) of each KB becomes the name of the corresponding category. For example, /Some/Path/SomeCategory.tsv.gz yields the category name SomeCategory. Each of the KBs must contain one entity name per line.
KBs containing known entity names
KBs containing override labels for entity names from kbs (necessary for the bio domain)
Filter which decides if a matched entity is valid
Generates alternative spellings of an entity name (necessary for the bio domain)
If true, we use Sentence.lemmas instead of Sentence.words during matching
If true, tokens are matched case insensitively
The new LexiconNER
This is just like the above but with the addition of the baseDirOpt.
The same apply, with even more default values filled in.
The same apply, with more default values filled in.
The same apply, with some default values filled in.
Creates a LexiconNER from a pair of sequences of knowledge bases (KBs), the kbs and overrideKBs, with control over the case sensitivity of individual KBs via caseInsensitiveMatchings.
The matchings run parallel to the KBs. That is, caseInsensitiveMatchings(n) is used for kbs(n). It is possible that the contents of an overrideKB refer to a KB that does not exist. In that situation, caseInsensitiveMatching is used as a fallback value.
KBs containing known entity names
KBs containing override labels for entity names from kbs (necessary for the bio domain)
Case-insensitivity settings corresponding to the kbs, matched by index
Filter which decides if a matched entity is valid
Generates alternative spellings of an entity name (necessary for the bio domain)
If true, we use Sentence.lemmas instead of Sentence.words during matching
If true, tokens are matched case insensitively
An optional directory to force kbs to be loaded from files rather than resources
The new LexiconNER
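The parallel indexing and the fallback described for this apply can be illustrated with a small lookup. This is a sketch under stated assumptions, not the library's code: byLabel and caseInsensitiveFor are hypothetical helpers, and the labels are assumed to be the category names already derived from kbs, in the same order.

```scala
object CaseMatchingLookup {
  // Hypothetical helper: pairs the label derived from kbs(n) with
  // caseInsensitiveMatchings(n), so both sequences must be the same length.
  def byLabel(labels: Seq[String], caseInsensitiveMatchings: Seq[Boolean]): Map[String, Boolean] = {
    require(labels.length == caseInsensitiveMatchings.length, "one matching per KB is expected")
    labels.zip(caseInsensitiveMatchings).toMap
  }

  // A label that appears only in an override KB has no entry in the map,
  // so the single caseInsensitiveMatching value is used as the fallback.
  def caseInsensitiveFor(label: String, known: Map[String, Boolean], caseInsensitiveMatching: Boolean): Boolean =
    known.getOrElse(label, caseInsensitiveMatching)
}
```

For example, with labels Seq("Gene", "Species") and matchings Seq(false, true), a lookup of "Species" gives true, while a lookup of a label that is not in the map gives whatever fallback value was passed in.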
Creates a LexiconNER from a pair of sequences of knowledge base sources for the kbs and overrideKBs. There are versions of the sources for knowledge bases stored in files and for those stored in memory. Each StandardKbSource knows its own caseSensitivityMatching, so no list of those needs to be supplied. It is possible that the contents of an overrideKB refer to a KB (label) that does not exist. In that situation, caseInsensitiveMatching is used as a fallback value. With that, this method should encompass all the functionality of the other apply methods, which now feed into it. Note that some of the arguments are in a different order than in the other build methods so that the method can be overloaded despite type erasure.
KB sources containing known entity names
KB sources containing override labels for entity names from kbs (necessary for the bio domain)
Generates alternative spellings of an entity name (necessary for the bio domain)
Filter which decides if a matched entity is valid
If true, we use Sentence.lemmas instead of Sentence.words during matching
If true, tokens are matched case insensitively
The new LexiconNER
Merges labels from src into dst without overlapping any existing labels in dst.
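One way to read "without overlapping" is sketched below. It rests on several assumptions that the description does not confirm: labels are BIO-style strings, "O" marks a token outside any entity, and a labeled span from src is copied only when every corresponding position in dst is still outside. The object name and the Outside constant are hypothetical.

```scala
object LabelMerging {
  val Outside = "O" // assumed label for tokens outside any entity

  // Copies each labeled span from src into dst only when every corresponding
  // position in dst is still Outside. dst is modified in place.
  def mergeLabels(dst: Array[String], src: Array[String]): Unit = {
    require(dst.length == src.length, "label sequences must have the same length")
    var i = 0
    while (i < src.length) {
      if (src(i) == Outside) i += 1
      else {
        // The span starts at i and continues while labels begin with "I-".
        var end = i + 1
        while (end < src.length && src(end).startsWith("I-")) end += 1
        val free = (i until end).forall(j => dst(j) == Outside)
        if (free) for (j <- i until end) dst(j) = src(j)
        i = end
      }
    }
  }
}
```

In this sketch a src span that collides with even one labeled token in dst is skipped as a whole, so dst never ends up with a partially copied entity.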