minimum similarity two items must have; pairs below this threshold are discarded from the result set
number of random vectors (hyperplanes) used to generate bit vectors of length d
maximum number of catalogItems to be matched for a query when there is a matching LSH hash
number of hash rounds. More rounds give a better approximation but a longer computation.
The number of times the catalog set should be replicated in order to increase LSH recall.
The percentile of the query bucket size that is used as the reference for the maximum bucket size. Any bucket larger than that is split into smaller sub-buckets.
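As an illustration of the "random vectors (hyperplanes)" parameter above, the following is a minimal sketch of how d random hyperplanes can turn a vector into a bit vector of length d. It uses the common sign-of-dot-product scheme with Gaussian hyperplanes, which is an assumption and not necessarily what this library does internally. The names `random_hyperplanes` and `signature` are hypothetical.

```python
import random


def random_hyperplanes(d, num_features, seed=0):
    """Draw d random Gaussian vectors, one hyperplane per signature bit (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_features)] for _ in range(d)]


def signature(vector, hyperplanes):
    """Return a tuple of bits: bit i is 1 if the vector lies on the non-negative side of hyperplane i."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, plane)) >= 0.0 else 0
        for plane in hyperplanes
    )


planes = random_hyperplanes(d=8, num_features=3)
sig_a = signature([1.0, 2.0, 3.0], planes)
sig_b = signature([2.0, 4.0, 6.0], planes)     # same direction as the first vector
sig_c = signature([-1.0, -2.0, -3.0], planes)  # opposite direction
```

Vectors pointing in the same direction receive the same signature regardless of their length, which is why such signatures are suited to cosine similarity.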
Find the k nearest neighbours in catalogMatrix for each entry in queryMatrix. Implementations may be either exact or approximate.
a row-oriented matrix. Each row in the matrix represents an item in the data set. Items are identified by their matrix index.
a row-oriented matrix. Each row in the matrix represents an item in the data set. Items are identified by their matrix index.
a similarity matrix with MatrixEntry(queryA, catalogB, similarity).
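For orientation, an exact (brute-force) variant of this contract might look like the sketch below. It works on plain Python lists rather than distributed matrices and returns `(query index, catalog index, similarity)` tuples in place of MatrixEntry objects. The function names `cosine` and `exact_knn` and the `threshold` argument are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def exact_knn(query_matrix, catalog_matrix, k, threshold=0.0):
    """For each query row, keep the k most similar catalog rows at or above the threshold."""
    entries = []
    for qi, q in enumerate(query_matrix):
        scored = [(qi, ci, cosine(q, c)) for ci, c in enumerate(catalog_matrix)]
        scored = [e for e in scored if e[2] >= threshold]
        scored.sort(key=lambda e: e[2], reverse=True)
        entries.extend(scored[:k])
    return entries


catalog = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[1.0, 0.1]]
result = exact_knn(queries, catalog, k=2, threshold=0.5)
```

Rows are identified by their index in the input lists, mirroring the "identified by their matrix index" convention described above.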
Standard LSH implementation. The queryMatrix is hashed multiple times and the dbMatrix is searched for exact hash matches. The cosine distance is then computed for these candidates.
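The procedure described above can be sketched in a few lines of single-machine Python. This is only an approximation of the idea under stated assumptions: hashing uses sign-of-dot-product signatures from Gaussian hyperplanes, candidates are scored with cosine similarity (rather than a distance) and filtered by a threshold, and the function name `lsh_join` and all its parameter names are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def signature(vector, planes):
    """One bit per hyperplane: 1 if the dot product is non-negative, else 0."""
    return tuple(
        1 if sum(v * h for v, h in zip(vector, p)) >= 0.0 else 0 for p in planes
    )


def lsh_join(query_matrix, db_matrix, d, rounds, threshold, max_matches, seed=0):
    """Hash both matrices in several rounds and score only exact signature matches."""
    rng = random.Random(seed)
    num_features = len(db_matrix[0])
    best = {}  # (query index, db index) -> similarity
    for _ in range(rounds):
        # Fresh hyperplanes in every round (assumed scheme).
        planes = [[rng.gauss(0.0, 1.0) for _ in range(num_features)] for _ in range(d)]
        buckets = {}
        for di, row in enumerate(db_matrix):
            buckets.setdefault(signature(row, planes), []).append(di)
        for qi, q in enumerate(query_matrix):
            # Cap the number of candidates taken from a matching bucket.
            for di in buckets.get(signature(q, planes), [])[:max_matches]:
                sim = cosine(q, db_matrix[di])
                if sim >= threshold:
                    best[(qi, di)] = sim
    return sorted((qi, di, sim) for (qi, di), sim in best.items())


db = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
queries = [[3.0, 0.0]]
matches = lsh_join(queries, db, d=4, rounds=5, threshold=0.9, max_matches=10)
```

Because only rows sharing an exact signature are compared, more rounds raise the chance that a true neighbour collides with the query at least once, at the cost of additional hashing work.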