public class DiacriticInsensitiveLevenshtein extends AbstractEditDistance
Levenshtein EditDistance insensitive to diacritics, i.e. pairs of words such as café and cafe, or joão and joao, will be considered to have an edit distance of 0, or a similarity of 1.

Modifier and Type | Field and Description |
---|---|
static int | FastFailures |
Constructor and Description |
---|
DiacriticInsensitiveLevenshtein(java.util.Locale locale) |
Modifier and Type | Method and Description |
---|---|
int | compute(java.lang.String str, java.lang.String rst) |
boolean | diacriticInsensitiveEquals(char char1, char char2) Determines whether char1 and char2 are equal independent of the presence of diacritic marks. |
boolean | isFailFast() |
double | normalize(int distance, java.lang.String str, java.lang.String rst) Normalizes the specified distance by max(|str|, |rst|). |
void | setFailThreshold(double threshold) |
computeNormalized
public DiacriticInsensitiveLevenshtein(java.util.Locale locale)
public double normalize(int distance, java.lang.String str, java.lang.String rst)
Normalizes the specified distance by max(|str|, |rst|). For historical reasons this method actually returns 1 - normalized distance, making it a similarity.
Parameters:
distance - The edit distance between str and rst.
str - A string
rst - Another string
Returns:
1 - distance/max(|str|, |rst|).
public int compute(java.lang.String str, java.lang.String rst)
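The relationship between compute and normalize can be illustrated with a small standalone sketch. It does not use this class or reproduce its implementation: the class name DiacriticInsensitiveDistanceSketch and the helper stripDiacritics are hypothetical, diacritics are removed with java.text.Normalizer (an assumption about one possible approach), the Locale passed to the real constructor is ignored, and returning 1.0 for two empty strings is also an assumption.

```java
import java.text.Normalizer;

public class DiacriticInsensitiveDistanceSketch {

    // Hypothetical helper: decompose to NFD, then drop all combining marks.
    static String stripDiacritics(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    // Two-row Levenshtein distance computed over the diacritic-stripped strings.
    static int compute(String str, String rst) {
        String a = stripDiacritics(str);
        String b = stripDiacritics(rst);
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Mirrors the documented formula: 1 - distance / max(|str|, |rst|).
    // Assumption: two empty strings are treated as fully similar.
    static double normalize(int distance, String str, String rst) {
        int max = Math.max(str.length(), rst.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance / max;
    }

    public static void main(String[] args) {
        String accented = "caf\u00e9"; // "café" with a precomposed é
        int d = compute(accented, "cafe");
        System.out.println(d);                              // prints 0
        System.out.println(normalize(d, accented, "cafe")); // prints 1.0
    }
}
```

Letters whose marks are not expressed as Unicode combining characters (for example ø) are not affected by this stripping, so the real class may treat them differently.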
public boolean diacriticInsensitiveEquals(char char1, char char2)
Determines whether char1 and char2 are equal independent of the presence of diacritic marks.
Parameters:
char1 - The first char
char2 - The second char
Returns:
true if char1 and char2 are equal, or false otherwise.
public boolean isFailFast()
public void setFailThreshold(double threshold)
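As an illustration of the character comparison described for diacriticInsensitiveEquals, the following is a minimal sketch of one way such a check could be written. It is not the library's code: the class name DiacriticEqualsSketch and the helper baseForm are hypothetical, and the sketch assumes that a diacritic is any Unicode combining mark exposed by NFD decomposition, without consulting a Locale.

```java
import java.text.Normalizer;

public class DiacriticEqualsSketch {

    // Hypothetical stand-in for the documented method.
    static boolean diacriticInsensitiveEquals(char char1, char char2) {
        if (char1 == char2) {
            return true;
        }
        return baseForm(char1).equals(baseForm(char2));
    }

    // Decomposes the character (NFD) and removes combining marks,
    // leaving only the base letter. A lone combining mark reduces to "".
    static String baseForm(char c) {
        String decomposed = Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(diacriticInsensitiveEquals('\u00e9', 'e')); // é vs e: prints true
        System.out.println(diacriticInsensitiveEquals('\u00e3', 'a')); // ã vs a: prints true
        System.out.println(diacriticInsensitiveEquals('a', 'b'));      // prints false
    }
}
```

The comparison stays case-sensitive in this sketch, so 'É' and 'e' are reported as different; whether the real method behaves the same way is not stated in this documentation.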