Interface | Description |
---|---|
AnnotationOrthography | |
OrthoMatcherRule |
Class | Description |
---|---|
BasicAnnotationOrthography | |
MatchRule0 |
RULE #0: If the two names are listed in the table of
spurious matches then they do NOT match
Condition(s): -
Applied to: all name annotations
|
MatchRule1 |
RULE #1: If the two names are identical then they are the same.
No longer used, because the check for the same string is done via the
hash table of previous annotations.
Condition(s): depend on case
Applied to: annotations other than names
|
MatchRule10 |
RULE #10: is one name the reverse of the other,
reversing around prepositions only?
|
MatchRule11 |
RULE #11: does one name consist of contractions
of the first two tokens of the other name?
|
MatchRule12 |
RULE #12: do the first and last tokens of one name
match the first and last tokens of the other?
|
MatchRule13 |
RULE #12: do the first and last tokens of one name
match the first and last tokens of the other?
|
MatchRule14 |
RULE #13: do multi-word names match except for
one token e.g.
|
MatchRule15 |
RULE #14: if the last token of one name
matches the second name
e.g.
|
MatchRule16 |
RULE #15: Does every token in the shorter name appear in the longer name?
|
MatchRule17 |
RULE #16: Conservative match rule.
Require every token in one name to match the other, except for tokens that are on a stop word list.
|
MatchRule2 |
RULE #2: if the two names are listed as equivalent in the
lookup table (alias) then they match
Condition(s): -
Applied to: all name annotations
|
MatchRule3 |
RULE #3: adding a possessive at the end
of one name causes a match
e.g.
|
MatchRule4 |
RULE #4: Does the first non-punctuation token from the long string match
the first token from the short string?
|
MatchRule5 |
RULE #4Name: Do all the non-punctuation tokens from the long string match the corresponding tokens
in the short string?
|
MatchRule6 |
RULE #5: if the 1st token of one name
matches the second name
e.g.
|
MatchRule7 |
RULE #6: if one name is the acronym of the other
e.g.
|
MatchRule8 |
RULE #7: if one of the tokens in one of the
names is in the list of separators, e.g. "&",
then check if the token before the separator
matches the other name
e.g.
|
MatchRule9 |
RULE #9: does one of the names match the token
just before a trailing company designator
in the other name?
|
OrthoMatcher | |
OrthoMatcherHelper |
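
To make a few of the rule descriptions above concrete, here is a minimal sketch of three of the checks: the possessive rule, the acronym rule, and the "every token in the shorter name appears in the longer name" rule. It is an illustration only and is not the code of the MatchRule classes. The class name `MatchRuleSketches`, its method names, whitespace tokenisation, and case-insensitive comparison are all assumptions made for this example. The actual rules work on annotations and have their own case-sensitivity conditions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration; not part of the orthomatcher package. */
final class MatchRuleSketches {

    private MatchRuleSketches() {}

    /** Assumption: names are tokenised on whitespace and compared case-insensitively. */
    static List<String> tokens(String name) {
        String normalised = name.trim().toLowerCase(Locale.ROOT);
        if (normalised.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(normalised.split("\\s+"));
    }

    /** Possessive rule: adding 's or ' to the end of one name yields the other. */
    static boolean matchesWithPossessive(String a, String b) {
        String x = a.trim().toLowerCase(Locale.ROOT);
        String y = b.trim().toLowerCase(Locale.ROOT);
        return y.equals(x + "'s") || y.equals(x + "'")
            || x.equals(y + "'s") || x.equals(y + "'");
    }

    /** Acronym rule: one name equals the initials of the other's tokens. */
    static boolean isAcronym(String a, String b) {
        return acronymOf(a, b) || acronymOf(b, a);
    }

    private static boolean acronymOf(String shortName, String longName) {
        List<String> longTokens = tokens(longName);
        if (longTokens.size() < 2) {
            return false; // a single token has no meaningful acronym
        }
        StringBuilder initials = new StringBuilder();
        for (String token : longTokens) {
            initials.append(token.charAt(0));
        }
        // Dots are dropped so that "I.B.M." and "IBM" are treated alike.
        String candidate = shortName.trim().toLowerCase(Locale.ROOT).replace(".", "");
        return candidate.contentEquals(initials);
    }

    /** Token-subset rule: every token of the shorter name occurs in the longer one. */
    static boolean shorterContainedInLonger(String a, String b) {
        List<String> ta = tokens(a);
        List<String> tb = tokens(b);
        List<String> shorter = ta.size() <= tb.size() ? ta : tb;
        List<String> longer = ta.size() <= tb.size() ? tb : ta;
        if (shorter.isEmpty()) {
            return false; // an empty name matches nothing
        }
        return new HashSet<>(longer).containsAll(shorter);
    }
}
```

With these assumptions, `MatchRuleSketches.isAcronym("I.B.M.", "International Business Machines")` and `MatchRuleSketches.shorterContainedInLonger("Tony Blair", "Prime Minister Tony Blair")` both hold, while `shorterContainedInLonger("Tony Blair", "Tony Benn")` does not. The example names are invented for illustration and do not come from the package documentation.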