minimum similarity two items must have; pairs below this threshold are discarded from the result set
number of random vectors (hyperplanes) used to generate bit vectors of length d. Should be fairly large to approximate the cosine distances well, e.g. 500 for an input size of 150
number of results to return for each entry in the query matrix
Find the k nearest neighbours in catalogMatrix for each entry in queryMatrix. Implementations may be either exact or approximate.
a row-oriented matrix. Each row represents an item in the data set. Items are identified by their row index in the matrix.
a row-oriented matrix. Each row represents an item in the data set. Items are identified by their row index in the matrix.
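a similarity matrix whose entries have the form MatrixEntry(queryA, catalogB, similarity), where queryA and catalogB are the row indices of the query item and the catalog item.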
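The contract described above (k results per query row, a minimum similarity, and (query, catalog, similarity) entries) can be illustrated with a small brute-force sketch. This is plain Python and not the library's API: the names `exact_neighbours` and `cosine_similarity`, the use of lists of floats as rows, and the inclusive threshold comparison are all assumptions made for the example.

```python
import math
from typing import List, Tuple

Row = List[float]
Entry = Tuple[int, int, float]  # (query index, catalog index, similarity)


def cosine_similarity(a: Row, b: Row) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_neighbours(query_matrix: List[Row], catalog_matrix: List[Row],
                     k: int, threshold: float) -> List[Entry]:
    """Hypothetical exact variant: compare every query row with every catalog row."""
    entries: List[Entry] = []
    for qi, q in enumerate(query_matrix):
        scored = [(qi, ci, cosine_similarity(q, c))
                  for ci, c in enumerate(catalog_matrix)]
        # Assumption: pairs exactly at the threshold are kept.
        kept = [e for e in scored if e[2] >= threshold]
        # Most similar first, then keep at most k results for this query row.
        kept.sort(key=lambda e: e[2], reverse=True)
        entries.extend(kept[:k])
    return entries


query = [[1.0, 0.0], [0.0, 1.0]]
catalog = [[1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]
for entry in exact_neighbours(query, catalog, k=2, threshold=0.5):
    print(entry)
```

A query row may yield fewer than k entries when too few catalog rows reach the threshold, as happens for the second query row in this example.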
Implementation based on approximate cosine distances. The cosine distances are approximated using Hamming distances, which are much faster to compute. The catalog matrix is broadcast. This implementation is therefore suited for tasks where the catalog matrix is very small compared to the query matrix.
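The idea of approximating cosine distances with Hamming distances can be sketched as follows. This is a minimal illustration of the standard random-hyperplane technique, not the library's code: every function name is hypothetical, signatures are stored as Python integers for convenience, and the estimator cos(pi * hamming / d) is the usual one for this technique, assumed here rather than taken from the source.

```python
import math
import random
from typing import List


def random_hyperplanes(d: int, input_size: int, seed: int = 0) -> List[List[float]]:
    """Draw d random Gaussian vectors, one hyperplane normal per signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(input_size)] for _ in range(d)]


def signature(vector: List[float], hyperplanes: List[List[float]]) -> int:
    """Bit vector of length d: one bit per hyperplane, set when the dot product is >= 0."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(x * w for x, w in zip(vector, plane))
        bits = (bits << 1) | (1 if dot >= 0.0 else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the two signatures differ."""
    return bin(a ^ b).count("1")


def approximate_cosine(sig_a: int, sig_b: int, d: int) -> float:
    """The fraction of differing bits estimates angle / pi; take its cosine."""
    return math.cos(math.pi * hamming_distance(sig_a, sig_b) / d)


planes = random_hyperplanes(d=2000, input_size=2, seed=42)
sig_a = signature([1.0, 0.0], planes)
sig_b = signature([1.0, 1.0], planes)
# The exact cosine similarity of these two vectors is about 0.707.
print(approximate_cosine(sig_a, sig_b, d=2000))
```

Signatures are computed once per row, after which each comparison costs only an XOR and a bit count. The estimate becomes more accurate as d grows, which is why a large number of hyperplanes is recommended.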