com.soundcloud

package lsh

Type Members

  1. trait Joiner extends AnyRef

    Finds, for every object in a dataset, its k nearest neighbours among the other objects in the same dataset.

  2. class Lsh extends Joiner with Serializable

    LSH (locality-sensitive hashing) implementation as described in 'Randomized Algorithms and NLP: Using Locality Sensitive Hash Functions for High Speed Noun Clustering' by Ravichandran et al.

  3. class NearestNeighbours extends Joiner with Serializable

    Brute-force O(n^2) method that computes exact nearest neighbours. Because this computation is very expensive, an additional sample parameter may be passed so that neighbours are computed only for a random fraction of the dataset.

  4. class SlidingRDD[T] extends RDD[Seq[T]]

    Represents an RDD formed by grouping the items of its parent RDD into fixed-size blocks, obtained by passing a sliding window over them.

  5. class SlidingRDDPartition[T] extends Partition with Serializable

    NOTE: SlidingRDD and SlidingRDDPartition are copied from MLlib and slightly modified, because the original classes are private to MLlib.

  6. trait VectorDisctance extends Serializable

    Interface defining a similarity measure between two vectors.
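
The paper cited for Lsh builds bit signatures from random hyperplanes and estimates cosine similarity from the Hamming distance between signatures. The following is a minimal sketch of that idea on plain Scala arrays, not the code of this package: the object SignatureSketch and all of its method names are hypothetical, and no Spark types are used.

```scala
import scala.util.Random

// Hypothetical illustration; names are not part of com.soundcloud.lsh.
object SignatureSketch {
  /** Draws `numBits` random hyperplanes of dimension `dim` with Gaussian components. */
  def randomHyperplanes(numBits: Int, dim: Int, seed: Long): Array[Array[Double]] = {
    val rng = new Random(seed)
    Array.fill(numBits, dim)(rng.nextGaussian())
  }

  /** One bit per hyperplane: true if the vector lies on its non-negative side. */
  def signature(v: Array[Double], planes: Array[Array[Double]]): Array[Boolean] =
    planes.map(p => p.zip(v).map { case (x, y) => x * y }.sum >= 0.0)

  /** Number of positions at which two signatures differ. */
  def hamming(a: Array[Boolean], b: Array[Boolean]): Int =
    a.zip(b).count { case (x, y) => x != y }

  /** Cosine estimate from the fraction of differing bits: cos(pi * hamming / numBits). */
  def estimatedCosine(a: Array[Boolean], b: Array[Boolean]): Double =
    math.cos(math.Pi * hamming(a, b) / a.length)
}
```

More bits give a tighter estimate at the cost of longer signatures; the number of bits here is a free parameter of the sketch.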

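The grouping that SlidingRDD describes can be illustrated on an ordinary in-memory sequence. This is only a sketch under assumptions: SlidingSketch and windows are hypothetical names, the standard collection method sliding stands in for the distributed version, and dropping a final window shorter than windowSize is an assumption rather than documented behaviour of this package.

```scala
// Hypothetical illustration; names are not part of com.soundcloud.lsh.
object SlidingSketch {
  /** Groups items into overlapping blocks of exactly `windowSize` consecutive elements. */
  def windows[T](items: Seq[T], windowSize: Int): Seq[Seq[T]] = {
    require(windowSize > 0, "windowSize must be positive")
    // `sliding` yields one short block when there are fewer items than the
    // window size; the filter discards it so every block has the same size.
    items.sliding(windowSize).filter(_.size == windowSize).toList
  }
}
```

For example, a window of size 2 over the items 1, 2, 3, 4 yields the blocks (1, 2), (2, 3) and (3, 4).
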
Value Members

  1. object Cosine extends VectorDisctance

    Implementation of VectorDisctance that computes the cosine similarity between two vectors.

  2. object Lsh extends Serializable

  3. object Main
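
To make the cosine measure and the exact O(n^2) join concrete, here is a minimal sketch on plain Scala collections. It is not this package's API: BruteForceSketch, cosine and nearestNeighbours are hypothetical names, vectors are assumed to be dense arrays of equal length, and returning 0.0 for a zero-norm vector is a choice made for the sketch.

```scala
// Hypothetical illustration; names are not part of com.soundcloud.lsh.
object BruteForceSketch {
  /** Cosine similarity of two equal-length dense vectors; 0.0 if either has zero norm. */
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val normA = math.sqrt(a.map(x => x * x).sum)
    val normB = math.sqrt(b.map(x => x * x).sum)
    if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
  }

  /** For each vector index, the k most similar other indices, most similar first. */
  def nearestNeighbours(vectors: IndexedSeq[Array[Double]], k: Int): Map[Int, Seq[(Int, Double)]] =
    vectors.indices.map { i =>
      // Compare vector i against every other vector: n - 1 comparisons per row.
      val scored = for (j <- vectors.indices if j != i)
        yield (j, cosine(vectors(i), vectors(j)))
      i -> scored.sortWith(_._2 > _._2).take(k)
    }.toMap
}
```

Every pair is compared, so the cost grows quadratically with the number of vectors, which is the reason an approximate method such as LSH is attractive for large datasets.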
