Package io.prestosql.type.setdigest
Class SetDigest
java.lang.Object
    io.prestosql.type.setdigest.SetDigest

public class SetDigest extends Object
For the MinHash algorithm, see "On the resemblance and containment of documents" by Andrei Z. Broder, and the Wikipedia page: http://en.wikipedia.org/wiki/MinHash#Variant_with_a_single_hash_function
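
The single-hash-function variant referenced above can be illustrated with a small, self-contained sketch. This is not the SetDigest implementation: the class name MinHashSketch, the mixing function, and the handling of empty sketches are assumptions made for illustration, and the method names add and jaccardIndex only mirror the signatures listed on this page. The idea is to keep the k smallest hash values of each set, then estimate the Jaccard index from the k smallest hashes of the combined sketches.

```java
import java.util.TreeSet;

// Hypothetical illustration of bottom-k MinHash with a single hash function.
// It is NOT the SetDigest source; names and hash choice are assumptions.
final class MinHashSketch {
    private final int maxHashes;
    private final TreeSet<Long> minHashes = new TreeSet<>();

    MinHashSketch(int maxHashes) {
        this.maxHashes = maxHashes;
    }

    // Stand-in 64-bit mixer (the MurmurHash3 finalizer). It is a bijection on
    // 64-bit values, so distinct long inputs never collide.
    static long hash(long value) {
        long h = value;
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Record the value's hash and keep only the maxHashes smallest hashes.
    void add(long value) {
        minHashes.add(hash(value));
        if (minHashes.size() > maxHashes) {
            minHashes.pollLast();
        }
    }

    int size() {
        return minHashes.size();
    }

    // Estimate |A ∩ B| / |A ∪ B|: walk the k smallest hashes of the combined
    // sketches and count those present in both. Returns 0.0 when both
    // sketches are empty (a convention chosen for this sketch).
    static double jaccardIndex(MinHashSketch a, MinHashSketch b) {
        int k = Math.min(a.maxHashes, b.maxHashes);
        TreeSet<Long> union = new TreeSet<>(a.minHashes);
        union.addAll(b.minHashes);

        int examined = 0;
        int shared = 0;
        for (long h : union) {
            if (examined == k) {
                break;
            }
            if (a.minHashes.contains(h) && b.minHashes.contains(h)) {
                shared++;
            }
            examined++;
        }
        return examined == 0 ? 0.0 : (double) shared / examined;
    }
}
```

When both sets hold fewer than k distinct values, every hash is retained and the result is the exact Jaccard index; for larger sets it is an estimate whose accuracy improves as k grows.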
Field Summary
Modifier and Type    Field
static int           DEFAULT_MAX_HASHES
static int           NUMBER_OF_BUCKETS
Method Summary
Modifier and Type                           Method
void                                        add(long value)
void                                        add(io.airlift.slice.Slice value)
long                                        cardinality()
int                                         estimatedInMemorySize()
int                                         estimatedSerializedSize()
static long                                 exactIntersectionCardinality(SetDigest a, SetDigest b)
Map<Long,Short>                             getHashCounts()
io.airlift.stats.cardinality.HyperLogLog    getHll()
boolean                                     isExact()
static double                               jaccardIndex(SetDigest a, SetDigest b)
void                                        mergeWith(SetDigest other)
static SetDigest                            newInstance(io.airlift.slice.Slice serialized)
io.airlift.slice.Slice                      serialize()
Field Detail
NUMBER_OF_BUCKETS
public static final int NUMBER_OF_BUCKETS
See Also: Constant Field Values

DEFAULT_MAX_HASHES
public static final int DEFAULT_MAX_HASHES
See Also: Constant Field Values

Method Detail
newInstance
public static SetDigest newInstance(io.airlift.slice.Slice serialized)

serialize
public io.airlift.slice.Slice serialize()

getHll
public io.airlift.stats.cardinality.HyperLogLog getHll()

estimatedInMemorySize
public int estimatedInMemorySize()

estimatedSerializedSize
public int estimatedSerializedSize()

isExact
public boolean isExact()

cardinality
public long cardinality()

exactIntersectionCardinality
public static long exactIntersectionCardinality(SetDigest a, SetDigest b)

add
public void add(long value)

add
public void add(io.airlift.slice.Slice value)

mergeWith
public void mergeWith(SetDigest other)