A marker interface for the cosine similarity algorithm.
Java wrapper for cosine similarity.
A marker interface for the Damerau-Levenshtein distance algorithm.
A marker interface for the Dice coefficient algorithm.
Java wrapper for Dice coefficient similarity.
A type class that adds a distance method to a StringMetricAlgorithm.
A marker interface for the Hamming distance algorithm.
Java wrapper for Hamming distance.
A marker interface for the Jaccard similarity algorithm.
Java wrapper for Jaccard similarity.
A marker interface for the Jaro similarity algorithm.
Java wrapper for Jaro and Jaro-Winkler similarity.
A marker interface for the Jaro-Winkler algorithm.
A marker interface for the Levenshtein distance algorithm.
Java wrapper for Levenshtein distance.
A marker interface for the longest common subsequence algorithm.
Java wrapper for longest common subsequence.
A marker interface for the Metaphone algorithm.
Java wrapper for Metaphone similarity.
A marker interface for the n-gram similarity algorithm.
Java wrapper for n-gram similarity.
A marker interface for the Needleman-Wunsch similarity algorithm.
Java wrapper for Needleman-Wunsch similarity.
A marker interface for the overlap similarity algorithm.
Java wrapper for overlap similarity.
A mix-in trait that adds a score method, computed from the distance method, to a StringMetricAlgorithm.
A type class that adds a score method to a StringMetricAlgorithm.
A marker interface for the Smith-Waterman similarity algorithm.
A marker interface for the Smith-Waterman-Gotoh similarity algorithm.
Java wrapper for Smith-Waterman similarity.
A type class that adds a sound score method to a StringMetricAlgorithm.
A marker interface for the Soundex similarity algorithm.
Java wrapper for Soundex similarity.
Defines the implementation for a StringMetricAlgorithm by adding implicit definitions from DistanceAlgorithm, ScoringAlgorithm, WeightedDistanceAlgorithm, or WeightedScoringAlgorithm.
A marker interface for a string metric algorithm.
A marker interface for the Tversky similarity algorithm.
A type class that adds a distance method with a second, typed parameter to a StringMetricAlgorithm.
A type class that adds a score method with a second, typed parameter to a StringMetricAlgorithm.
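The marker-interface and type-class entries above can be read as one pattern: an empty marker trait names an algorithm, a type class indexed by that marker supplies the method, and an implicit instance ties the two together. The following is a minimal sketch of that pattern, not the library's source. All names in it (MetricAlgorithm, MismatchAlgorithm, Distance, ScoreFromDistance, Metric) are hypothetical, the mismatch count is a toy distance chosen for brevity, and the normalization used for the score is an assumption.

```scala
object TypeClassSketch {
  // Marker interfaces: no members, they only name an algorithm at the type level.
  trait MetricAlgorithm
  trait MismatchAlgorithm extends MetricAlgorithm

  // Type class: supplies a distance method for the algorithm named by A.
  trait Distance[A <: MetricAlgorithm] {
    def distance(s1: String, s2: String): Int
  }

  // Mix-in: derives a score from the distance.
  // Assumption: the score is 1 minus the distance divided by the longer length.
  trait ScoreFromDistance[A <: MetricAlgorithm] { self: Distance[A] =>
    def score(s1: String, s2: String): Double = {
      val maxLen = math.max(s1.length, s2.length)
      if (maxLen == 0) 1.0 else 1.0 - distance(s1, s2).toDouble / maxLen
    }
  }

  // Implicit definition binding an implementation to the marker.
  implicit object MismatchDistance
      extends Distance[MismatchAlgorithm]
      with ScoreFromDistance[MismatchAlgorithm] {
    // Toy distance: number of differing positions plus the difference in length.
    def distance(s1: String, s2: String): Int =
      s1.zip(s2).count { case (a, b) => a != b } + math.abs(s1.length - s2.length)
  }

  // Front end: each method compiles only when an implicit instance for A is in scope.
  class Metric[A <: MetricAlgorithm] {
    def distance(s1: String, s2: String)(implicit d: Distance[A]): Int =
      d.distance(s1, s2)
    def score(s1: String, s2: String)(implicit s: ScoreFromDistance[A]): Double =
      s.score(s1, s2)
  }

  object Mismatch extends Metric[MismatchAlgorithm]
}

object TypeClassDemo {
  import TypeClassSketch._

  def main(args: Array[String]): Unit = {
    println(Mismatch.distance("martha", "marhta")) // 2: the fourth and fifth characters differ
    println(Mismatch.score("martha", "marhta"))
  }
}
```

Keeping the marker empty means a new algorithm can be added by defining one marker trait and one implicit instance, without touching the front-end class.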
Implicit definition of cosine similarity score for CosineAlgorithm.
Implicit definition of Damerau-Levenshtein distance for DamerauLevenshteinAlgorithm.
Implicit definition of Dice coefficient score for DiceCoefficientAlgorithm.
Implicit definition of Hamming distance for HammingAlgorithm.
Implicit definition of Jaccard score for JaccardAlgorithm.
Implicit definition of Jaro score for JaroAlgorithm.
Implicit definition of Jaro-Winkler score for JaroWinklerAlgorithm.
Implicit definition of Levenshtein distance for LevenshteinAlgorithm.
Implicit definition of longest common subsequence distance for LongestCommonSeqAlgorithm.
Implicit definition of Metaphone score for MetaphoneAlgorithm.
Implicit definition of n-gram distance for NGramAlgorithm.
Implicit definition of n-gram score for NGramAlgorithm.
Implicit definition of Needleman-Wunsch score for NeedlemanWunschAlgorithm.
Implicit definition of overlap score for OverlapAlgorithm.
Implicit definition of Smith-Waterman-Gotoh score for SmithWatermanGotohAlgorithm.
Implicit definition of Smith-Waterman score for SmithWatermanAlgorithm.
Implicit definition of Soundex score for SoundexAlgorithm.
The Strategy object provides two strategies, each a regular expression on which to split input. Strategy.splitWord splits a word into a sequence of characters. Strategy.splitSentence splits a sentence into a sequence of words.
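The two splitting behaviours described above can be illustrated with plain regular expressions. This is a sketch only: SplitSketch, its two pattern values, and the tokens helper are hypothetical stand-ins, and the library's actual expressions may differ.

```scala
object SplitSketch {
  // Hypothetical patterns, chosen to reproduce the documented behaviour.
  val splitWord: String = ""          // empty pattern: split between every character
  val splitSentence: String = "\\W+"  // split on runs of non-word characters

  // Splits the input on the given regular expression and drops empty tokens.
  def tokens(input: String, pattern: String): List[String] =
    input.split(pattern).toList.filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    println(tokens("hello", splitWord))                  // List(h, e, l, l, o)
    println(tokens("hello world, again", splitSentence)) // List(hello, world, again)
  }
}
```

Splitting by character suits comparisons of single words, while splitting by word suits set-based comparisons of whole sentences.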
Object that extends the String class with distance and score operations.
import com.github.vickumar1981.stringdistance.StringConverter._

// Scores between two strings
val cosSimilarity: Double = "hello".cosine("chello")
val damerau: Double = "martha".damerau("marhta")
val diceCoefficient: Double = "martha".diceCoefficient("marhta")
val hamming: Double = "martha".hamming("marhta")
val jaccard: Double = "karolin".jaccard("kathrin")
val jaro: Double = "martha".jaro("marhta")
val jaroWinkler: Double = "martha".jaroWinkler("marhta")
val levenshtein: Double = "martha".levenshtein("marhta")
val needlemanWunsch: Double = "martha".needlemanWunsch("marhta")
val ngramSimilarity: Double = "karolin".nGram("kathrin")
val bigramSimilarity: Double = "karolin".nGram("kathrin", 2)
val overlap: Double = "karolin".overlap("kathrin")
val smithWaterman: Double = "martha".smithWaterman("marhta")
val smithWatermanGotoh: Double = "martha".smithWatermanGotoh("marhta")
val tversky: Double = "karolin".tversky("kathrin", 0.5)

// Distances between two strings
val damerauDist: Int = "martha".damerauDist("marhta")
val hammingDist: Int = "martha".hammingDist("marhta")
val levenshteinDist: Int = "martha".levenshteinDist("marhta")
val longestCommonSeq: Int = "martha".longestCommonSeq("marhta")
val ngramDist: Int = "karolin".nGramDist("kathrin")
val bigramDist: Int = "karolin".nGramDist("kathrin", 2)

// Phonetic similarity of two strings
val metaphone: Boolean = "merci".metaphone("mercy")
val soundex: Boolean = "merci".soundex("mercy")
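Operator-style calls on a String, as in the usage above, are typically enabled in Scala by an implicit wrapper class. The sketch below shows that mechanism with a hand-written Levenshtein distance. StringOpsSketch and DistanceOps are hypothetical names, and the implementation is a standard two-row dynamic program written for illustration, not the library's code.

```scala
object StringOpsSketch {
  // Hypothetical wrapper: importing StringOpsSketch._ adds the method to every String.
  implicit class DistanceOps(s1: String) {
    // Levenshtein distance: minimum number of insertions, deletions, and substitutions.
    def levenshteinDist(s2: String): Int = {
      // prev(j) is the distance between the processed prefix of s1 and the first j chars of s2.
      var prev = Array.tabulate(s2.length + 1)(identity)
      for (i <- 1 to s1.length) {
        val curr = new Array[Int](s2.length + 1)
        curr(0) = i // deleting the first i characters of s1
        for (j <- 1 to s2.length) {
          val cost = if (s1(i - 1) == s2(j - 1)) 0 else 1
          curr(j) = math.min(
            math.min(curr(j - 1) + 1, prev(j) + 1), // insertion, deletion
            prev(j - 1) + cost                      // match or substitution
          )
        }
        prev = curr
      }
      prev(s2.length)
    }
  }
}

object StringOpsDemo {
  import StringOpsSketch._

  def main(args: Array[String]): Unit = {
    println("martha".levenshteinDist("marhta"))  // 2
    println("kitten".levenshteinDist("sitting")) // 3
  }
}
```

The wrapper is only active where it is imported, so the added methods do not leak into code that does not ask for them.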
Main class that organizes the different string distance algorithms.
import com.github.vickumar1981.stringdistance.Strategy
import com.github.vickumar1981.stringdistance.StringDistance._
import com.github.vickumar1981.stringdistance.impl.{ConstantGap, LinearGap}

// Scores between strings
val cosSimilarity: Double = Cosine.score("hello", "chello", Strategy.splitWord)
val damerau: Double = Damerau.score("martha", "marhta")
val diceCoefficient: Double = DiceCoefficient.score("martha", "marhta")
val hamming: Double = Hamming.score("martha", "marhta")
val jaccard: Double = Jaccard.score("karolin", "kathrin", 1)
val jaro: Double = Jaro.score("martha", "marhta")
val jaroWinkler: Double = JaroWinkler.score("martha", "marhta", 0.1)
val levenshtein: Double = Levenshtein.score("martha", "marhta")
val needlemanWunsch: Double = NeedlemanWunsch.score("martha", "marhta", ConstantGap())
val ngramSimilarity: Double = NGram.score("karolin", "kathrin", 1)
val bigramSimilarity: Double = NGram.score("karolin", "kathrin", 2)
val overlap: Double = Overlap.score("karolin", "kathrin", 1)
val smithWaterman: Double = SmithWaterman.score("martha", "marhta", (LinearGap(gapValue = -1), Integer.MAX_VALUE))
val smithWatermanGotoh: Double = SmithWatermanGotoh.score("martha", "marhta", ConstantGap())
val tversky: Double = Tversky.score("karolin", "kathrin", 0.5)

// Distances between strings
val damerauDist: Int = Damerau.distance("martha", "marhta")
val hammingDist: Int = Hamming.distance("martha", "marhta")
val levenshteinDist: Int = Levenshtein.distance("martha", "marhta")
val longestCommonSubSeq: Int = LongestCommonSeq.distance("martha", "marhta")
val ngramDist: Int = NGram.distance("karolin", "kathrin", 1)
val bigramDist: Int = NGram.distance("karolin", "kathrin", 2)
Main class that organizes the different phonetic (sound-based) string algorithms.
import com.github.vickumar1981.stringdistance.StringSound._
import com.github.vickumar1981.stringdistance.implicits._

// Phonetic similarity between strings
val metaphone: Boolean = Metaphone.score("merci", "mercy")
val soundex: Boolean = Soundex.score("merci", "mercy")
Implicit definition of Tversky score for TverskyAlgorithm.
Provides classes for calculating distances and fuzzy match similarities between two strings. Also provides implicits for using distance and fuzzy match scores as operators between two strings.
Includes functionality for phonetic comparisons between strings.
Overview
The main class to use is com.github.vickumar1981.stringdistance.StringDistance.
If you import com.github.vickumar1981.stringdistance.StringConverter, you can use the string distance and score functions as operators between two strings.
To compare two strings phonetically, i.e. to check whether they sound alike, use the com.github.vickumar1981.stringdistance.StringSound class.
To use in Java, please use the corresponding classes in the com.github.vickumar1981.stringdistance.util package.
| Class | Description |
| :--- | :--- |
| com.github.vickumar1981.stringdistance.StringDistance | Singleton class with fuzzy match scores and distances |
| com.github.vickumar1981.stringdistance.StringConverter | Implicit conversions between strings s1 and s2 |
| com.github.vickumar1981.stringdistance.StringSound | Phonetic comparison between strings s1 and s2 |
| com.github.vickumar1981.stringdistance.util.StringDistance | Java class for fuzzy match scores and distances |
| com.github.vickumar1981.stringdistance.util.StringSound | Java class for phonetic comparison between strings s1 and s2 |