Class HyperLogLogCollector

  • All Implemented Interfaces:
    Comparable<HyperLogLogCollector>
    Direct Known Subclasses:
    VersionOneHyperLogLogCollector, VersionZeroHyperLogLogCollector

    public abstract class HyperLogLogCollector
    extends Object
    implements Comparable<HyperLogLogCollector>
    Implements the HyperLogLog cardinality estimator described in: http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf

    Run this code to see a simple indication of expected errors based on different m values:

        for (int i = 1; i < 20; ++i) {
          System.out.printf("i[%,d], val[%,d] => error[%f%%]%n", i, 2 << i, 104 / Math.sqrt(2 << i));
        }

    This class is *not* thread-safe. It can be passed among threads, but it is written with the assumption that only one thread is ever calling methods on it. If multiple threads call methods on it concurrently, correct behavior is not guaranteed.

    Note that despite the non-thread-safety of this class, it is currently used by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and "get" methods can be called simultaneously by OnheapIncrementalIndex, since its "doAggregate" and "getMetricObjectValue" methods are not synchronized. So, watch out for that.
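The error loop in the class description can be wrapped into a small runnable program. The sketch below is illustrative only: the class name HllErrorSketch and the helper registersForErrorPercent are hypothetical and not part of this API. It uses the same 104 / sqrt(m) percentage as the description and adds the inverse question, namely how many registers a target error would need. Note that 2 << i evaluates to 2^(i+1), so the val column starts at 4.

```java
final class HllErrorSketch {
    /** Expected relative error, in percent, for m registers: 104 / sqrt(m). */
    static double expectedErrorPercent(long m) {
        return 104.0 / Math.sqrt((double) m);
    }

    /**
     * Hypothetical helper: smallest power-of-two register count whose
     * expected error is at or below the target percentage.
     */
    static long registersForErrorPercent(double targetPercent) {
        if (!(targetPercent > 0.0)) {
            throw new IllegalArgumentException("targetPercent must be positive");
        }
        long m = 2;
        // The upper bound keeps the shift from overflowing for tiny targets.
        while (m < (1L << 62) && expectedErrorPercent(m) > targetPercent) {
            m <<= 1;
        }
        return m;
    }

    public static void main(String[] args) {
        // Same loop as in the class description; 2L << i equals 2^(i+1).
        for (int i = 1; i < 20; ++i) {
            long m = 2L << i;
            System.out.printf("i[%,d], val[%,d] => error[%f%%]%n", i, m, expectedErrorPercent(m));
        }
        System.out.println(registersForErrorPercent(2.5));
    }
}
```

Under these assumptions, a target of 2.5% yields 2048 registers, since 1024 registers give 3.25% and 2048 give roughly 2.3%.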
    • Constructor Detail

      • HyperLogLogCollector

        public HyperLogLogCollector​(ByteBuffer byteBuffer)
    • Method Detail

      • makeCollector

        public static HyperLogLogCollector makeCollector​(ByteBuffer buffer)
        Create a wrapper object around an HLL sketch contained within a buffer. The position and limit of the buffer may be changed; if you do not want this to happen, you can duplicate the buffer before passing it in. The mark and byte order of the buffer will not be modified.
        Parameters:
        buffer - buffer containing an HLL sketch starting at its position and ending at its limit
        Returns:
        HLLC wrapper object
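The advice to duplicate the buffer before passing it in can be illustrated with java.nio alone. In the sketch below, readAll is a hypothetical stand-in for any callee that moves the position and limit of the buffer it receives; it is not part of this class. One detail worth knowing: ByteBuffer.duplicate() does not carry over the byte order (the duplicate starts as BIG_ENDIAN), so the sketch copies the order explicitly.

```java
import java.nio.ByteBuffer;

final class DuplicateBeforeWrapping {
    /** Hypothetical callee that changes the position and limit of its argument. */
    static int readAll(ByteBuffer buf) {
        int sum = 0;
        while (buf.hasRemaining()) {
            sum += buf.get(); // relative get advances the position
        }
        buf.clear(); // resets position to 0 and limit to capacity
        return sum;
    }

    /** Returns {sum, original position, original limit, view position, view limit}. */
    static int[] demo() {
        ByteBuffer original = ByteBuffer.wrap(new byte[] {0, 0, 1, 2, 3, 0});
        original.position(2);
        original.limit(5);

        // The duplicate shares the bytes but has its own position, limit, and mark.
        ByteBuffer view = original.duplicate().order(original.order());
        int sum = readAll(view);

        return new int[] {
            sum, original.position(), original.limit(), view.position(), view.limit()
        };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

The original buffer keeps position 2 and limit 5 even though the callee moved the duplicate's position and limit.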
      • makeCollectorSharingStorage

        public static HyperLogLogCollector makeCollectorSharingStorage​(HyperLogLogCollector otherCollector)
        Creates a new collector that shares the other collector's buffer (by using ByteBuffer.duplicate()).
        Parameters:
        otherCollector - collector whose buffer will be shared
        Returns:
        a new collector backed by the same buffer
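What sharing through ByteBuffer.duplicate() implies can be shown with plain buffers. This is a minimal sketch of the java.nio behavior only, with a hypothetical class name; it does not call this class. It assumes that two collectors sharing storage behave like the two buffers here: content is shared, while position and limit are tracked independently.

```java
import java.nio.ByteBuffer;

final class SharedStorageSketch {
    /** Returns {byte seen through the original, original position, duplicate position}. */
    static int[] demo() {
        ByteBuffer storage = ByteBuffer.allocate(4);
        ByteBuffer shared = storage.duplicate();

        // An absolute put through the duplicate is visible through the original,
        // because both buffers are views over the same underlying bytes.
        shared.put(0, (byte) 42);

        // Moving the duplicate's position leaves the original's position alone.
        shared.position(3);

        return new int[] {storage.get(0), storage.position(), shared.position()};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

Because the bytes are shared, a write made through either view is visible through the other, so the single-thread assumption stated in the class description applies to the pair as a whole.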
      • getLatestNumBytesForDenseStorage

        public static int getLatestNumBytesForDenseStorage()
      • makeEmptyVersionedByteArray

        public static byte[] makeEmptyVersionedByteArray()
      • applyCorrection

        public static double applyCorrection​(double e,
                                             int zeroCount)
      • estimateByteBuffer

        public static double estimateByteBuffer​(ByteBuffer buf)
      • getVersion

        public abstract byte getVersion()
      • setVersion

        public abstract void setVersion​(ByteBuffer buffer)
      • getRegisterOffset

        public abstract byte getRegisterOffset()
      • setRegisterOffset

        public abstract void setRegisterOffset​(byte registerOffset)
      • setRegisterOffset

        public abstract void setRegisterOffset​(ByteBuffer buffer,
                                               byte registerOffset)
      • getNumNonZeroRegisters

        public abstract short getNumNonZeroRegisters()
      • setNumNonZeroRegisters

        public abstract void setNumNonZeroRegisters​(short numNonZeroRegisters)
      • setNumNonZeroRegisters

        public abstract void setNumNonZeroRegisters​(ByteBuffer buffer,
                                                    short numNonZeroRegisters)
      • getMaxOverflowValue

        public abstract byte getMaxOverflowValue()
      • setMaxOverflowValue

        public abstract void setMaxOverflowValue​(byte value)
      • setMaxOverflowValue

        public abstract void setMaxOverflowValue​(ByteBuffer buffer,
                                                 byte value)
      • getMaxOverflowRegister

        public abstract short getMaxOverflowRegister()
      • setMaxOverflowRegister

        public abstract void setMaxOverflowRegister​(short register)
      • setMaxOverflowRegister

        public abstract void setMaxOverflowRegister​(ByteBuffer buffer,
                                                    short register)
      • getNumHeaderBytes

        public abstract int getNumHeaderBytes()
      • getNumBytesForDenseStorage

        public abstract int getNumBytesForDenseStorage()
      • getPayloadBytePosition

        public abstract int getPayloadBytePosition()
      • getPayloadBytePosition

        public abstract int getPayloadBytePosition​(ByteBuffer buffer)
      • getInitPosition

        protected int getInitPosition()
      • getStorageBuffer

        protected ByteBuffer getStorageBuffer()
      • add

        public void add​(byte[] hashedValue)
      • add

        public void add​(short bucket,
                        byte positionOf1)
      • toByteArray

        public byte[] toByteArray()
      • estimateCardinalityRound

        public long estimateCardinalityRound()
      • estimateCardinality

        public double estimateCardinality()
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object