Class SetDigest


  • public class SetDigest
    extends Object
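    A set digest backed by a HyperLogLog sketch and a MinHash structure, as reflected in the constructor parameters hll and minhash. For the MinHash algorithm, see "On the resemblance and containment of documents" by Andrei Z. Broder, and the single-hash-function variant described on Wikipedia: http://en.wikipedia.org/wiki/MinHash#Variant_with_a_single_hash_function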
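    The single-hash-function MinHash variant cited in the class description can be illustrated with a minimal bottom-k sketch. This shows the general technique only and is not the SetDigest implementation: the class name BottomKMinHashSketch, its methods, and the SplitMix64-style mixing function are all assumptions made for this example. The sketch keeps the k smallest hash values of each set and estimates the Jaccard index from the k smallest hashes of the two signatures combined.

```java
import java.util.TreeSet;

// Illustrative bottom-k MinHash (single hash function). Hypothetical class,
// not part of SetDigest or any library.
final class BottomKMinHashSketch {
    private final int k;                                   // maximum number of hashes retained
    private final TreeSet<Long> smallest = new TreeSet<>(); // the k smallest hashes seen so far

    BottomKMinHashSketch(int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        this.k = k;
    }

    // SplitMix64-style mixer used as a stand-in hash function (an assumption;
    // a real digest may hash differently). It is a bijection on 64-bit values,
    // so distinct inputs never collide.
    static long hash(long value) {
        long z = value + 0x9e3779b97f4a7c15L;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }

    // Record a value, evicting the largest hash once more than k are held.
    void add(long value) {
        smallest.add(hash(value));
        if (smallest.size() > k) {
            smallest.pollLast();
        }
    }

    int size() {
        return smallest.size();
    }

    // Estimate |A ∩ B| / |A ∪ B|: take the k smallest hashes of the combined
    // signatures and count how many of them occur in both signatures.
    static double jaccard(BottomKMinHashSketch a, BottomKMinHashSketch b) {
        if (a.k != b.k) {
            throw new IllegalArgumentException("sketches must use the same k");
        }
        TreeSet<Long> union = new TreeSet<>(a.smallest);
        union.addAll(b.smallest);
        if (union.isEmpty()) {
            return 0.0; // convention chosen here for two empty sets
        }
        int sampleSize = Math.min(a.k, union.size());
        int seen = 0;
        int shared = 0;
        for (long h : union) { // ascending order of hash value
            if (seen == sampleSize) {
                break;
            }
            seen++;
            if (a.smallest.contains(h) && b.smallest.contains(h)) {
                shared++;
            }
        }
        return (double) shared / sampleSize;
    }
}
```

    When k exceeds the number of distinct values in both sets, every hash is retained and the estimate equals the true Jaccard index. For example, with k = 100, the sets {0..9} and {5..14} yield 5/15. For larger sets the result is an estimate whose accuracy improves as k grows.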
    • Constructor Detail

      • SetDigest

        public SetDigest()
      • SetDigest

        public SetDigest(int maxHashes,
                         int numHllBuckets)
      • SetDigest

        public SetDigest(int maxHashes,
                         io.airlift.stats.cardinality.HyperLogLog hll,
                         it.unimi.dsi.fastutil.longs.Long2ShortSortedMap minhash)
    • Method Detail

      • newInstance

        public static SetDigest newInstance(io.airlift.slice.Slice serialized)
      • serialize

        public io.airlift.slice.Slice serialize()
      • getHll

        public io.airlift.stats.cardinality.HyperLogLog getHll()
      • estimatedInMemorySize

        public int estimatedInMemorySize()
      • estimatedSerializedSize

        public int estimatedSerializedSize()
      • isExact

        public boolean isExact()
      • cardinality

        public long cardinality()
      • exactIntersectionCardinality

        public static long exactIntersectionCardinality(SetDigest a,
                                                        SetDigest b)
      • add

        public void add(long value)
      • add

        public void add(io.airlift.slice.Slice value)
      • mergeWith

        public void mergeWith(SetDigest other)