Alternate constructor that loads from a stream, optionally with a set of words to constrain the vocabulary
Alternate constructor that loads from a source, optionally with a set of words to constrain the vocabulary
Alternate constructor that loads from a file, optionally with a set of words to constrain the vocabulary
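As a rough illustration of loading with a vocabulary constraint, the sketch below parses the plain-text word2vec format (a header line with vocabulary size and dimensionality, then one word and its values per line) and keeps only the words in an optional set. The names `LoadSketch`, `loadMatrix`, and `wordsToUse` are assumptions made for this example, not the class's actual signatures.

```scala
object LoadSketch {
  // Hypothetical helper, not the real constructor: reads text-format
  // embeddings from an iterator of lines into a word -> vector map.
  def loadMatrix(
      lines: Iterator[String],
      wordsToUse: Option[Set[String]] = None): Map[String, Array[Double]] = {
    // Assumed header layout: "<vocabulary size> <dimensions>".
    val dims = lines.next().trim.split("\\s+")(1).toInt
    lines
      .map(_.trim.split("\\s+"))
      .filter(_.length == dims + 1) // skip malformed rows
      .filter(bits => wordsToUse.forall(_.contains(bits(0)))) // None keeps every word
      .map(bits => bits(0) -> bits.tail.map(_.toDouble))
      .toMap
  }
}
```

For a file, the lines could come from `scala.io.Source.fromFile(path).getLines()`.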
Finds the average word2vec similarity between any two words in these two texts. IMPORTANT: the inputs must be words, not lemmas!
Fetches the embeddings vector for a given word (not lemma)
The word
The array of embedding weights
For a sequence of (word, weight) pairs, interpolates the vectors corresponding to the words by their respective weights, and normalizes the resulting vector
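The weighted interpolation described above can be sketched as follows. This is a minimal version under stated assumptions: the matrix is a plain `Map` from word to vector, words missing from the matrix are skipped, and `InterpolateSketch`, `interpolate`, and `normalize` are illustrative names rather than the class's real members.

```scala
object InterpolateSketch {
  // Scales a vector to unit length; a zero vector is returned unchanged.
  def normalize(v: Array[Double]): Array[Double] = {
    val len = math.sqrt(v.map(x => x * x).sum)
    if (len == 0.0) v else v.map(_ / len)
  }

  // Sums weight * vector over all (word, weight) pairs, then normalizes.
  def interpolate(
      matrix: Map[String, Array[Double]],
      weighted: Seq[(String, Double)],
      dims: Int): Array[Double] = {
    val total = new Array[Double](dims)
    weighted.foreach { case (word, weight) =>
      // Assumption: words missing from the matrix are silently skipped.
      matrix.get(word).foreach { vec =>
        for (i <- 0 until dims) total(i) += weight * vec(i)
      }
    }
    normalize(total)
  }
}
```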
Finds the maximum word2vec similarity between any two words in these two texts. IMPORTANT: t1, t2 must be arrays of words, not lemmas!
Finds the words most similar to this set of inputs. IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()!
filterPredicate: if passed, only returns words that match the predicate
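One way such a lookup with an optional predicate might work is sketched below. It assumes the query is the sum of the input words' vectors and that candidates are ranked by cosine similarity; the source does not confirm either detail. `MostSimilarSketch`, `mostSimilarWords`, and `howMany` are hypothetical names.

```scala
object MostSimilarSketch {
  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val denom = math.sqrt(dot(a, a) * dot(b, b))
    if (denom == 0.0) 0.0 else dot(a, b) / denom
  }

  // Ranks vocabulary words by cosine similarity to the summed input vectors.
  def mostSimilarWords(
      matrix: Map[String, Array[Double]],
      words: Set[String],
      howMany: Int,
      filterPredicate: Option[String => Boolean] = None): List[(String, Double)] = {
    val vecs = words.toList.flatMap(w => matrix.get(w).toList)
    if (vecs.isEmpty) Nil
    else {
      val query = vecs.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
      matrix.toList
        .filter { case (w, _) => filterPredicate.forall(p => p(w)) } // None keeps all
        .map { case (w, v) => (w, cosine(query, v)) }
        .sortBy { case (_, score) => -score }
        .take(howMany)
    }
  }
}
```

In this sketch the input words themselves are not excluded from the result, so a query word will usually rank first.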
Similar to sanitizedTextSimilarity, but using the multiplicative heuristic of Levy and Goldberg (2014). IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()!
Similarity value
Similar to textSimilarity, but using the multiplicative heuristic of Levy and Goldberg (2014). IMPORTANT: t1, t2 must be arrays of words, not lemmas!
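The exact formula is not given here, so the following is only one plausible reading of the multiplicative heuristic: each pairwise cosine is shifted into [0, 1] via (cos + 1) / 2, as Levy and Goldberg do for their multiplicative objective, and the shifted values are multiplied together. `MultiplicativeSketch` and its method name are assumptions, and an empty set of pairs yields 1.0 in this sketch.

```scala
object MultiplicativeSketch {
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // Multiplies the shifted cosines of every known word pair across the two texts.
  def multiplicativeTextSimilarity(
      matrix: Map[String, Array[Double]],
      t1: Seq[String],
      t2: Seq[String]): Double = {
    val shifted = for {
      w1 <- t1
      v1 <- matrix.get(w1).toSeq // unknown words contribute nothing
      w2 <- t2
      v2 <- matrix.get(w2).toSeq
    } yield (cosine(v1, v2) + 1.0) / 2.0 // map [-1, 1] onto [0, 1]
    shifted.product
  }
}
```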
Finds the average word2vec similarity between any two words in these two texts. IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()! Changelog: (Peter/June 4/2014) Now returns a list of pairwise word scores, for optional answer justification.
Finds the maximum word2vec similarity between any two words in these two texts. IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()!
Finds the minimum word2vec similarity between any two words in these two texts. IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()!
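The average, maximum, and minimum variants can all be derived from the same list of pairwise scores. The sketch below assumes a `Map`-backed matrix, skips words without a vector, and returns 0.0 when no pair is available; those choices and the names in `PairwiseSketch` are illustrative assumptions.

```scala
object PairwiseSketch {
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // One (score, word1, word2) triple per pair of known words.
  def pairwiseScores(
      matrix: Map[String, Array[Double]],
      t1: Seq[String],
      t2: Seq[String]): Seq[(Double, String, String)] =
    for {
      w1 <- t1
      v1 <- matrix.get(w1).toSeq
      w2 <- t2
      v2 <- matrix.get(w2).toSeq
    } yield (cosine(v1, v2), w1, w2)

  // Returns the average together with the pairwise scores that produced it.
  def avgSimilarity(
      matrix: Map[String, Array[Double]],
      t1: Seq[String],
      t2: Seq[String]): (Double, Seq[(Double, String, String)]) = {
    val scores = pairwiseScores(matrix, t1, t2)
    val avg = if (scores.isEmpty) 0.0 else scores.map(_._1).sum / scores.size
    (avg, scores)
  }

  def maxSimilarity(matrix: Map[String, Array[Double]], t1: Seq[String], t2: Seq[String]): Double = {
    val scores = pairwiseScores(matrix, t1, t2).map(_._1)
    if (scores.isEmpty) 0.0 else scores.max
  }

  def minSimilarity(matrix: Map[String, Array[Double]], t1: Seq[String], t2: Seq[String]): Double = {
    val scores = pairwiseScores(matrix, t1, t2).map(_._1)
    if (scores.isEmpty) 0.0 else scores.min
  }
}
```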
Computes the cosine similarity between two texts, according to the word2vec matrix. IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()!
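A common way to compare two texts with word vectors is to sum the vectors of each text into a composite vector and take the cosine of the two composites. The sketch below assumes that approach, which the source does not spell out; `TextSimilaritySketch`, `compositeVector`, and `dims` are hypothetical names.

```scala
object TextSimilaritySketch {
  // Adds up the vectors of all known words in the text.
  def compositeVector(
      matrix: Map[String, Array[Double]],
      text: Seq[String],
      dims: Int): Array[Double] = {
    val total = new Array[Double](dims)
    for (w <- text; v <- matrix.get(w); i <- 0 until dims) total(i) += v(i)
    total
  }

  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // Cosine of the two composite vectors; 0.0 if either text has no known word.
  def textSimilarity(
      matrix: Map[String, Array[Double]],
      t1: Seq[String],
      t2: Seq[String],
      dims: Int): Double =
    cosine(compositeVector(matrix, t1, dims), compositeVector(matrix, t2, dims))
}
```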
Computes the similarity between two given words. IMPORTANT: words here must already be normalized using Word2vec.sanitizeWord()!
The first word
The second word
The cosine similarity of the two corresponding vectors
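For reference, the cosine similarity of two vectors a and b is their dot product divided by the product of their lengths. A minimal sketch for a word-level lookup follows; returning `None` when either word is missing is an assumption of this example, as are the names `WordSimilaritySketch` and `similarity`.

```scala
object WordSimilaritySketch {
  // cos(a, b) = (a . b) / (|a| * |b|), with 0.0 for zero-length vectors.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // Looks up both words and compares their vectors.
  def similarity(
      matrix: Map[String, Array[Double]],
      w1: String,
      w2: String): Option[Double] =
    for {
      v1 <- matrix.get(w1)
      v2 <- matrix.get(w2)
    } yield cosine(v1, v2)
}
```

If the stored vectors are already unit length, the division is unnecessary and the dot product alone gives the same value.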
Computes the cosine similarity between two texts, according to the word2vec matrix. IMPORTANT: t1, t2 must be arrays of words, not lemmas!
Implements similarity metrics using the word2vec matrix. IMPORTANT: in our implementation, words are lowercased but NOT lemmatized or stemmed (see sanitizeWord). User: mihais, dfried, gus. Date: 11/25/13. Last Modified: Fix compiler issue: import scala.io.Source.
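The distinction between methods that take raw words and methods that take sanitized words can be illustrated with a small wrapper. The `sanitizeWord` below is a simplified stand-in that only trims and lowercases; the real method may do more. `SanitizeSketch`, `sanitizeText`, and `onRawWords` are hypothetical names.

```scala
object SanitizeSketch {
  // Simplified stand-in for sanitizeWord: trims and lowercases only,
  // with no lemmatization or stemming ("Dogs" becomes "dogs", not "dog").
  def sanitizeWord(word: String): String = word.trim.toLowerCase

  // Sanitizes every word and drops those that end up empty.
  def sanitizeText(text: Seq[String]): Seq[String] =
    text.map(sanitizeWord).filter(_.nonEmpty)

  // Lifts a metric that expects sanitized words into one that accepts raw words.
  def onRawWords[A](sanitizedMetric: (Seq[String], Seq[String]) => A)(
      t1: Seq[String],
      t2: Seq[String]): A =
    sanitizedMetric(sanitizeText(t1), sanitizeText(t2))
}
```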